As a research project I think Perspective is really cool. I also think a machine-learning classifier like this can be valuable when it's used to flag things that might need human review. But as an automated moderator on its own, it seems unlikely to be useful.
I tried pasting various Cake posts into the demo at the website you linked to, and results varied pretty widely. Even though none of the posts I picked were even close to what I would consider toxic, Perspective flagged several of them as either "likely toxic" or "not sure".
The use of swear words, even in positive or neutral contexts, seems to be a big factor in making it think something is toxic. It also has no ability to distinguish between original content and quoted text, so for instance it rated your own post here (the one I'm replying to) as "not sure" (53% likely to be toxic).
Toxic comments can also be disguised somewhat by padding them out with a ton of non-toxic text. For instance it confidently rated the text "fuck you" as 99% toxic. But "fuck you" followed by a copied and pasted Wikipedia article about football was only 67% toxic, putting it into "not sure" territory.
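If anyone wants to try these spot checks in bulk rather than via the web demo, the demo is backed by Google's public Comment Analyzer API. Here's a rough sketch of the padding experiment; the endpoint and request shape follow the public API docs, but the phrases are placeholders and you'd need your own API key to actually send anything.

```python
import json  # used by the commented-out request code below

# Public endpoint for Perspective's Comment Analyzer API.
API_URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
           "comments:analyze")

def build_request(text):
    """Build the JSON body for a TOXICITY analysis request."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }

# The padding experiment: score a toxic phrase on its own, then the
# same phrase buried in a long block of innocuous filler, and compare.
short = build_request("some toxic phrase")
padded = build_request("some toxic phrase " + "innocuous filler " * 500)

# To actually send a request (requires an API key):
# import urllib.request
# req = urllib.request.Request(
#     API_URL + "?key=YOUR_KEY",
#     data=json.dumps(padded).encode(),
#     headers={"Content-Type": "application/json"},
# )
# resp = json.load(urllib.request.urlopen(req))
# score = resp["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
```

If the padded score comes back much lower than the short one, that's the dilution effect described above, which suggests the score is something like an average over the whole text rather than a flag on the worst part.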
I think we still have a long way to go.