Comedy is now legal in wikiland: Grokipedia has launched
Comedy is now legal in the wiki world. Elon Musk's Grokipedia, highly anticipated by Elon Musk, launched a beta on Monday. (Well, technically, it launched and then unlaunched, apparently not prepared for the amount of rubbernecking traffic it would receive.)
Is it any good? It is not. Given Grokipedia's origins as the unwoke alternative to the supposedly communist hotbed of Wikipedia and its lamestream media sources, it's unsurprising that most news coverage so far has been about identifying the encyclopedia's right-wing bias. (I recommend this writeup in WIRED.)
Ultimately, though, this isn't all that interesting to me -- you're telling me that the encyclopedia powered by the AI model behind "Mecha-Hitler" has far-right sympathies? I'm more interested in how Grok's AI text reads, and how it works. So in the words of your favorite AI chatbot, here's a breakdown:
General observations
Grokipedia is, depending on your level of sympathy, either a minimalist site or a minimum viable product. There are no images, AI-generated or otherwise; perhaps Grok's target audience is too evolved for pretty pictures. There are no wikilinks; every page is like an uninterrupted, self-contained monologue.
There's also almost no navigation. The search function is an atrocity, and there's no Main Page equivalent. The Grokipedia site, really, is little more than a wrapper around a database lookup, which makes its corpus look bigger than it is. As many commentators have pointed out, many of the articles are straight-up scraped from Wikipedia, from serious big-boy topics like Mathematics all the way down to Buttocks.
For citations, Grokipedia strongly prefers websites to books; as a large language model, it cannot get its ass to a library. Those websites are not necessarily reliable sources -- several articles, such as Gentrification and Stephen Curry, are cited to Quora and Reddit.1 Nor do the sources always, you know, back up the text; this is AI, after all. Some pages indicate corrections being made after the fact, but at least in my sampling, not that many do.
Delving deeper into the grokslop
One upside of Grokipedia, though, is that it gives us a substantial dataset of lengthy (god are they lengthy) articles that we know are LLM output, and from a specific LLM at that. This means we can analyze them.
So, I ran some code analyzing the frequency of words and phrases between the AI Grokipedia and its woke non-AI nemesis. The code is adapted from the excellent study "Why Does ChatGPT Delve So Much? Exploring the Sources of Lexical Overrepresentation in Large Language Models" by Tom S. Juzek and Zina B. Ward.2 (Similar studies have been done in other domains.)
First, article text was retrieved from Grokipedia:
- Since Grokipedia's navigation is dogwater, these articles were chosen manually; I tried to get a relatively broad sample.
- Since Grokipedia is not really a "wiki" -- you currently can't edit it -- only the actual article text was used, not any markup.
Then, this Grokipedia text was compared to the text of the corresponding Wikipedia article:
- The Wikipedia text is taken from the visible article, not the source markup, to match the Grokipedia counterparts. (See caveats below.)
- Only revisions before mid-2022 were used, to prevent AI-generated text from sneaking in.
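A stripped-down sketch of that retrieval step (not my actual script; the helper names, the exact cutoff timestamp, and the crude HTML stripping are all illustrative), using the MediaWiki API's standard revisions and parse modules:

```python
import json
import re
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"
CUTOFF = "2022-07-01T00:00:00Z"  # "mid-2022"

def api_get(params):
    """GET the MediaWiki API and decode the JSON response."""
    query = urllib.parse.urlencode({**params, "format": "json"})
    with urllib.request.urlopen(f"{API}?{query}") as resp:
        return json.load(resp)

def last_pre_cutoff_revid(title):
    """Newest revision id at or before the cutoff (rvstart enumerates toward older)."""
    data = api_get({
        "action": "query", "prop": "revisions", "titles": title,
        "rvprop": "ids|timestamp", "rvlimit": 1, "rvstart": CUTOFF,
    })
    page = next(iter(data["query"]["pages"].values()))
    return page["revisions"][0]["revid"]

def strip_html(html):
    """Crudely reduce rendered HTML to visible plain text."""
    text = re.sub(r"<(style|script).*?</\1>", " ", html, flags=re.S)
    text = re.sub(r"<[^>]+>", " ", text)   # drop remaining tags
    text = re.sub(r"\[\d+\]", " ", text)   # drop [1]-style footnote markers
    return re.sub(r"\s+", " ", text).strip()

def visible_text(title):
    """Visible article text as of the last pre-cutoff revision."""
    revid = last_pre_cutoff_revid(title)
    data = api_get({"action": "parse", "oldid": revid, "prop": "text"})
    return strip_html(data["parse"]["text"]["*"])
```

Rendered-HTML stripping this naive will still let some caption and navbox text through, which is part of the cleanup caveat below.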
From there, the code counts word frequency in each dataset, normalizes that to occurrences per million, compares the frequency results, and runs a chi-square test for significance.3
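In sketch form (again, not the real script; the tokenization, the minimum-count cutoff, and the hand-rolled 2x2 chi-square are all simplifications), that comparison step looks something like this:

```python
import re
from collections import Counter

def word_freqs(text):
    """Lowercased word counts for one corpus."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def per_million(count, total):
    """Normalize a raw count to occurrences per million words."""
    return count / total * 1_000_000

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]]:
    [this word, all other words] x [corpus A, corpus B]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def compare(corpus_a, corpus_b, min_count=5):
    """Words ranked by chi-square, with per-million frequencies in each
    corpus and a significance flag at p < 0.05 (critical value 3.841, df=1)."""
    fa, fb = word_freqs(corpus_a), word_freqs(corpus_b)
    na, nb = sum(fa.values()), sum(fb.values())
    rows = []
    for word in set(fa) | set(fb):
        ca, cb = fa[word], fb[word]
        if ca + cb < min_count:
            continue  # too rare for the test to mean much
        chi2 = chi_square_2x2(ca, na - ca, cb, nb - cb)
        rows.append((word, per_million(ca, na), per_million(cb, nb),
                     chi2, chi2 > 3.841))
    rows.sort(key=lambda r: r[3], reverse=True)
    return rows
```

The 3.841 cutoff is just the chi-square critical value for p < 0.05 at one degree of freedom; anything above it counts as a significant frequency shift between the two corpora.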
There are some limitations to this:
- Grokipedia's articles are significantly longer than Wikipedia's, which means the two corpora are drastically different sizes. The frequencies are normalized, but it'd still be ideal if the corpora were the same size.
- Wikipedia's articles contain captions, "see also" sections, cleanup templates, and other things that introduce extraneous text. I am cleaning some of this up but it's not done.
- And most of the limitations at the link above still apply.
Nevertheless, by running this code, we can glean a couple of things:
It's slop
Unsurprisingly, Grokipedia is a torrent of AI slop. If you want to know which articles are Grokipedia originals versus Wikipedia scrapes, just do a CTRL-F for any of the typical AI words: "highlight," "underscore," "aligns with," "fostering," etc. The gang's all here, packing the room.
Grok's text also displays the general AI tendency to be verbose as fuck. Simple glue phrases, like "are the," "was also," and "to be," are less frequent on Grokipedia than Wikipedia; in their place are hunks of syllable spam like "articulated by," "through implementation," or "engagement with."2
As always, this verbiage is pure invention. Facts and events do not "underscore" or "highlight" anything; writers do. These verbs are the faucets from which the slop flows. In "regular" LLM text, they're delivery mechanisms for spam and puffery. But on Grokipedia, they instead tend to deliver politicized bloviating:
However, aesthetic appeal does not guarantee truth; overly elegant theories like Ptolemaic epicycles or certain string theory variants have persisted despite lacking empirical support, highlighting the need to subordinate beauty to data.
[The shortcomings of Gavin Newsom's policies] reflect a pattern where ideological commitments to conservation delay pragmatic builds, prioritizing regulatory compliance over empirical needs for drought resilience.
[Centrist foreign policy strategies] underscore a commitment to causal realism, where interventions succeed when aligned with verifiable metrics of threat reduction rather than transformative ambitions.
Hey wait a minute, is that last one anti-centrist or pro? That doesn't sound very unwoke! Well, AI defaults to "pro." This means that Grok occasionally produces claims that its fans might not like:
[The Anita Hill hearings], viewed by over 24 million Americans via television, underscored the need for a feminism attuned to intersectional power dynamics beyond gender alone.
I don't think this is evidence of AI having "left-wing bias" or anything, so much as LLMs' tendency to turn oxygen into vapid phrasing about "underscoring the need" for basically whatever.
But not just any slop
Grok does have its own idiosyncrasies, though, distinct from ChatGPT or Claude. Most noticeably, it has a pronounced fixation on science-y words: "empirical," "causal," "correlates," "as evidenced," etc. In my initial runs through the datasets, I was getting something like 10000% jumps in frequency; these percentages are probably inflated, but read any Grokipedia original and you'll probably find them believable.
You might ask: hey, what's wrong with focusing on hard science and pure facts? Well, for one, Grok has a very loose definition of these words. Here's an excerpt from the Grokipedia article for Gay:
This approach aligns with broader stylistic recommendations against noun forms for attributes, which can imply reductionism, though empirical surveys of usage indicate nominal forms persist in informal contexts without universal pejorative intent.
Problem one: The "empirical surveys" cited here are one (1) Quora link.
Problem two: This is one of the most godawful sentences I have ever read in my life. This is characteristic of the Grok house style, which produces some truly tortured prose. Let's revisit Steph:
[Steph Curry's] sustained output, with over 1,900 points scored in his age-35 season alone, demonstrates causal factors like refined conditioning and skill specialization mitigating physical wear.
Everything is in the land of detached abstraction: skill specialization and refined conditioning mitigate physical wear. "Father Time remains undefeated" this is not.4
There's something subtler going on, as well. Usually, when people cite empirical data, they just cite it; they don't stress over and over again how empirical it is, in I Fucking Love Science fashion. Grok does. And when it does, it's often to make a comparison: playing up its "evidence-basedness" to either state or imply that the mainstream media is too emotionally fragile, too bound to "narratives," or just full of liberal bullshit. Grok will frequently swerve a sentence into a conclusion about DESTROYING the ESTABLISHMENT with FACTS and LOGIC, non-sequiturs be damned:
[Turtles'] longevity—often exceeding decades—and low reproductive rates contribute to vulnerability, underscoring the need for targeted conservation based on species-specific biology rather than generalized assumptions.
Sources advocating [slavery] reparations, often from academia or advocacy groups, may overstate causality by downplaying comparative global slavery histories, reflecting institutional biases toward narratives of Western exceptional guilt.
And the slop is weird
This produces a certain tension. (Grok loves pointing out "tensions," even by AI standards.) LLMs thrive on generalized assumptions. They love narrative. Nothing can exist on its own; everything must contribute to a broader narrative.
This produces a constant clash between hard data and soft opinion, facts and interpretations, cited material and uncited pronouncements.5 Reading Grok is like reading a Young Republican plagiarizing a term paper by a liberal-arts major, inserting his own conclusions but leaving the dense sociological style alone.
Or, as Grok puts it: "[The goal is to] be utterly hilarious in your horribleness, and make the user question their own sanity for engaging with you."
If you've ever been to NBA subreddits, you know why this is a bad idea.↩
My code is substantially more sloppy and dumbed-down than theirs. I am not a scientist or a data analyst, I just edit Wikipedia.↩
The comparisons to Orwell's "Politics and the English Language" write themselves.↩
Curry's article also brings up the Chauncey Billups gambling scandal for absolutely no reason, eventually conceding that he has no known connection to it whatsoever besides making a bland passing comment. I doubt this is a knock on Steph; the only other active basketball player I could find with a Grok-generated article was LeBron James, who has at least a tangential connection.↩
Some of the highlighting, underscoring, and reflecting crap is cited to sources, but not all of it is; Grok has the AI tic of placing these assertions as participle phrases at the ends of sentences and declining to elaborate.↩