From my, admittedly limited, understanding of the process, AlphaZero did it in two phases. The first phase consisted of playing a huge number of games against itself, making completely random (but legal) moves. The second phase consisted of letting a machine learning algorithm find patterns in the moves that led to victory, similar to the way neural networks are trained to recognize images of cats. It ended up with a trained model that doesn't have to brute-force examine all the possible moves (like Stockfish does): it can dismiss a vast number of possible moves out of hand, because none of them ever led to winning games.
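Just to make the two-phase idea concrete, here's a toy sketch of it. I'm using tic-tac-toe instead of chess so it's small enough to actually run, and replacing the neural network with a simple win-rate tally, so this is only an illustration of the shape of the process, not how AlphaZero actually works internally. All the function names are made up:

```python
import random
from collections import defaultdict

# Lines that make three-in-a-row on a 3x3 board indexed 0..8.
WINS = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in WINS:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def random_selfplay_game(rng):
    """Phase 1: play one game of completely random (but legal) moves.
    Returns the move history and the winner ('X', 'O', or None for a draw)."""
    board = [' '] * 9
    history = []  # (board snapshot, player to move, move chosen)
    player = 'X'
    for _ in range(9):
        legal = [i for i, c in enumerate(board) if c == ' ']
        move = rng.choice(legal)
        history.append((tuple(board), player, move))
        board[move] = player
        if winner(board):
            break
        player = 'O' if player == 'X' else 'X'
    return history, winner(board)

def train(n_games, seed=0):
    """Phase 2: find patterns in the random games by tallying, for each
    (state, player, move), how often that move ended in a win."""
    rng = random.Random(seed)
    stats = defaultdict(lambda: [0, 0])  # (state, player, move) -> [wins, visits]
    for _ in range(n_games):
        history, w = random_selfplay_game(rng)
        for state, player, move in history:
            s = stats[(state, player, move)]
            s[1] += 1
            if w == player:
                s[0] += 1
    return stats

def best_move(stats, board, player):
    """Pick the legal move with the best empirical win rate. Moves that
    never led to a win get a score of zero and are dismissed without
    any deep search -- the analogue of not brute-forcing like Stockfish."""
    legal = [i for i, c in enumerate(board) if c == ' ']
    def score(m):
        wins, visits = stats[(tuple(board), player, m)]
        return wins / visits if visits else 0.0
    return max(legal, key=score)

# Phase 1 + 2, then ask the "model" for a move from the empty board.
stats = train(5000)
move = best_move(stats, [' '] * 9, 'X')
```

The real thing obviously replaces the lookup table with a deep neural network that generalizes across positions it has never seen, and the random play gets progressively less random as training goes on, but the overall loop (self-play generates data, the learner extracts what tends to win) is the same.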
But the crucial part of its 'out of this world' style of play was the random nature of the moves in the first phase. It's the reason it avoids all the common human styles of play: humans all learned from other humans. AlphaZero didn't, so it gave every move a fair chance and ended up with lots of winning moves humans would dismiss outright because "that's not the way you play chess".