The Paragraph version of That Was The Week

A reminder for new readers. That Was The Week collects the best writing on critical issues in tech, startups, and venture capital. I selected the articles because they are of interest. The selections often include things I entirely disagree with. But they express common opinions, or they provoke me to think. The articles are only snippets. Click on the headline to go to the original. I express my point of view in the editorial and the weekly video below.

Thanks To This Week’s Contributors: @Furedibyte, @kwharrison13, @Jason, @dweisburd, @altcap, @bgurley, @AndreRetterath, @benedictevans, @apple, @BazeleyMikiko, @OpenAI, @jeffjarvis, @bennstancil, @gregdale, @mslopatto, @cjgustafson222, @cookie, @karrisaarinen, @henrysward, @jessicakmathews, @CaseyNewton, @charlesarthur, @jessesingal, @siliconangle, @geneteare, @crunchbasenews
What is Truth? Quite a philosophical question for a weekly newsletter about technology and venture capital. But this week’s content seems to require the question.
Substack stands accused of hosting Nazi content; Casey Newton’s Platformer first announced that it was staying on the platform, then said it was leaving. Much of the reason rests on reader feedback between the two decisions. Jonathan Katz, writing in The Atlantic, stated:
The newsletter-hosting site Substack advertises itself as the last, best hope for civility on the internet—and aspires to a bigger role in politics in 2024. But just beneath the surface, the platform has become a home and propagator of white supremacy and anti-Semitism. Substack has not only been hosting writers who post overtly Nazi rhetoric on the platform; it profits from many of them.
Below, you will find accusations that this characterization is, at best, exaggerated and, at worst, intentionally malicious.
It turns out Katz almost entirely fabricated what is perhaps his most damning anecdote about Substack’s approach to extremism. After I lay out, in detail, how he did this, I’ll explain how The Atlantic (and Katz) responded to my critique. Then I’ll close with a discussion of the difficulty of developing consistent content moderation guidelines, drawing on several Substack competitors’ deeply troubled attempts to do so.
Similar events unfolded concerning Carta, the company that manages the cap tables of many tech startups.
The CEO of one such company, Linear, posted a long piece on X outlining how Carta had approached an investor in the company to gauge their readiness to sell their shares in a secondary transaction.
First of all, I posted this publicly because I suspected there is a broader systematic issue with Carta. A company that is dealing with extreme trust, corporate cap table and other private matters, should take safeguarding confidential information seriously.
Since then I’ve learned from multiple companies that this has been going on for months or even years where investors or employees of private companies are solicited by Carta employees to put their shares on sale. These people haven’t opted in to this and companies haven’t approved these sales.
If Carta and Carta Marketplace employees have free access to company information and cap table information in order to generate secondary sales (which companies often don’t want) it all starts to seem rotten.
Carta responded quickly, asserting that the incident was a one-off impacting three companies, instigated by a rogue employee acting outside its protocols. CEO Henry Ward then posted that Carta would exit the secondary sales market due to trust issues. There was a lot of back and forth between those two endpoints. During one exchange, Ward accused the CEO of Linear:
Henry Ward, Carta’s CEO, acknowledged the mistake but questioned Saarinen’s continued use of Carta despite his public criticism. “But despite feeling so upset about our mistaken email that you are calling for the end of Carta, and eliminate 2,000 jobs and strand 40,000 customers, you didn’t ask to cancel your contract with Carta,” tweeted Ward. “It seems you are still planning to stay with us despite all of the public bashing? I don’t understand? Was this just to firebomb us for your personal twitter and LinkedIn exposure?”
The truth in all of this is hard to find. But Ward questioning Saarinen’s motives came close to suggesting he was not wronged, even though he clearly was.
Normal debates on social media are often carried out with accusations of “fake news” and counter-accusations of lying. “You’re lying” has become a sufficient response to a view one disagrees with. It goes alongside the weaponization of words as a bullying tactic. “Nazi” is now routinely used to describe almost any conservative view. “Holocaust” is used to describe any violence toward any group of people, when it properly means the deliberate extermination of an entire group for no reason other than what they are. Intolerance and language intertwine to create a culture where discussion and difference cannot exist.
The real world is not as “clean” as we would all like. The desire to cleanse it of views we find intolerable is a very illiberal instinct. And “lying” or “fake news” are simply cleansing mechanisms, not real discussion or debate.
Frank Furedi writes this week’s Essay of the Week, opening up a discussion of these issues. He examines the Davos World Economic Forum publication, The Global Risks Report 2024:
The Global Risks Report 2024 more or less claims that misinformation and disinformation constitute the greatest risk facing society in the period ahead.
And quotes from it:
…emerging as the most severe global risk anticipated over the next two years, foreign and domestic actors alike will leverage misinformation and disinformation to further widen societal and political divides
Furedi concludes:
At the heart of the discussion about Fake News is the question of who gets to decide what is false and what is real. And that is a roundabout way of saying who gets to decide what is true. One of the most important ways that a society comes to a consensus about what is true is through argument and debate and political struggle. Democratic elections are not just about choosing specific policies but also about deciding whose view of the world should prevail. In recent elections the hitherto hegemonic status of the globalist worldview has come under challenge by newly emerging populist and anti-status quo movements, many of which have gone from strength to strength. It is the fear of the outcome of the many impending elections that has motivated the authors of the WEF report to brand misinformation as a global existential threat.
His point about the need for debate to decide choices and outcomes is clearly right. Shouting down an opponent or accusing them of lying is the bullying tactic of a poor debater. More importantly, democracy itself is threatened when speech can be bullied into silence.
The attempt to force Substack to close down publications that are within its terms of service is a form of bullying. To the founders’ credit, they are not prepared to abandon their belief in reason and open discourse.
They may not like the association, given their shared history, but Elon Musk’s attitude to speech on X is similar, and also worthy of support.
As adults, we can all read and make up our minds about what we believe without needing discourse to be cleansed of views outside a narrow range.
There will be no video this week as Andrew is traveling. Back in full next week. Enjoy. There is a lot to make you think in the pieces below.
FRANK FUREDI, JAN 13, 2024
Roots & Wings with Frank Furedi
The World Economic Forum Really Thinks That The Biggest Global Risk Is Democracy

The World Economic Forum gathers its troops together at Davos this coming week. It appears that its main concern is the outcome of the numerous elections that are coming up during the next couple of years. That is why they regard their lack of control over democratic decision making as the biggest threat facing their world.
If you want to understand why the globalist elite is so out of touch with the problems and challenges facing people and society, then you must peruse the World Economic Forum’s Global Risks Report 2024. Written for those in attendance at this year’s WEF conference at Davos, the report outlines what the globalist oligarchs perceive to be the main risks confronting them. The report is based on a Global Risks Perception Survey, which presumes to communicate the views of the experts and stakeholders who subscribe to the globalist consensus of the WEF.
This report serves as a paradigm of what I diagnosed elsewhere as democracy panic. The current wave of democracy panic is spread by people who believe that the ‘demos’ are influenced by prejudice and fake news.
The tone of the Report is grim. It warns that the ‘eruption of active hostilities in multiple regions is contributing to an unstable global order characterized by polarizing narratives, eroding trust and insecurity’. Throughout the report references are made to the scourge of ‘polarising narratives’. The WEF’s anxiety regarding ‘polarising narratives’ is not surprising since what this term refers to is the emergence of anti-elitist and counter-cultural ideals that challenge the outlook of a complacent ruling elite. Coupled with the obsession with ‘polarising narratives’ is a near hysterical concern with the risk represented by ‘misinformation and disinformation’.
The Global Risks Report 2024 more or less claims that misinformation and disinformation constitute the greatest risk facing society in the period ahead. It notes that:
‘emerging as the most severe global risk anticipated over the next two years, foreign and domestic actors alike will leverage misinformation and disinformation to further widen societal and political divides’!
The report explicitly connects the alleged risks posed by fake news to its concern with the outcome of the numerous elections that will be held in the next two years. It states that:
As close to three billion people are expected to head to the electoral polls across several economies – including Bangladesh, India, Indonesia, Mexico, Pakistan, the United Kingdom and the United States – over the next two years, the widespread use of misinformation and disinformation, and tools to disseminate it, may undermine the legitimacy of newly elected governments. Resulting unrest could range from violent protests and hate crimes to civil confrontation and terrorism.
As Carolina Klint, Europe chief commercial officer for consultants Marsh McLennan, which helped produce the report’s findings, stated: ‘The potential impact’ of fake news ‘on elections worldwide over the next two years is significant and that could lead to elected governments’ legitimacy being put in question’.
Since competing claims about what is and what is not true have plagued elections since the beginning of modern times, it is far from clear why they should constitute such a dangerous global threat to humanity.
The report acknowledges that ‘misinformation and disinformation have long histories’ but asserts that ‘the erosion of political checks and balances, and the growing sophistication of tools that spread and control information’ could ‘amplify the efficacy of domestic disinformation over the next two years’. For the authors of the report, the development of new technologies coinciding with the erosion of trust in the political establishment and its institutions creates the conditions in which fake news represents an unprecedented threat to global stability.
There is little doubt that new technologies such as AI-generated content can provide new opportunities for confusing and misleading the public with false information. But virtually every new form of communication technology since the invention of the printing press has possessed the potential to promote lies and distort reality. Coping with this threat has been integral to the political and socio-economic challenges facing society throughout modern times. From small self-serving misinformation to the Big Lie, society has always been confronted with the challenge of upholding the truth. That this normal problem of modern society is elevated to the status of an existential threat is driven not so much by the problem posed by the new technologies of misinformation as by concern with the uncertainty posed by democratic decision making.

At the heart of the discussion about Fake News is the question of who gets to decide what is false and what is real. And that is a roundabout way of saying who gets to decide what is true. One of the most important ways that a society comes to a consensus about what is true is through argument and debate and political struggle. Democratic elections are not just about choosing specific policies but also about deciding whose view of the world should prevail. In recent elections the hitherto hegemonic status of the globalist worldview has come under challenge by newly emerging populist and anti-status quo movements, many of which have gone from strength to strength. It is the fear of the outcome of the many impending elections that has motivated the authors of the WEF report to brand misinformation as a global existential threat.
When the report raises the alarm about the possibility of fake news hijacking elections in 2024 and 2025, what it is really saying is that the wrong kind of people and parties could prevail. It is worth noting that the emergence of concern with Fake News coincided with the failure of the forces of globalism to manage the June 2016 Brexit Referendum in their favour. The election of Donald Trump later that year was frequently ascribed to the role played by fake news on social media platforms and other nefarious, technologically created false propaganda.
2016 marked a turning point in the fortunes of the political and cultural elites who subscribe to WEF’s cosmopolitan orientation. Ever since the Brexit Referendum in June 2016, the proceedings at the World Economic Forum have been haunted by the challenge that this event posed for the globalist outlook of those in attendance. From their perspective Brexit symbolised the threat that populism represented to their way of life. Writing in Forbes a month after the referendum, one journalist correctly characterised this event as ‘The Populist Revolt Against “Davos Man”’. Kenneth Rapoza observed that ‘Brexit proved once again that Davos Man isn't all-knowing’. He added that Davos Man ‘has the rhetoric down and he knows how to spread the gospel, but beyond that, their near-term predictions lack vision’.
To this day Brexit is perceived as the launchpad for a global populist revolt. Writing in The Washington Post last year, Ishaan Tharoor claimed that ‘in the U.S. House drama, you can see the long tail of Brexit’. Most supporters of Brexit have no idea how much consternation their triumph caused the global cosmopolitan network of Remainers, EU ideologues and their allies in the World Economic Forum. Their sense of alarm was well captured a month after the referendum by the economic commentator Anatole Kaletsky. He noted that ‘Europe’s fear of contagion is justified, because the Brexit referendum’s outcome has transformed the politics of EU fragmentation’. He added that ‘Brexit has turned “Leave” (whether the EU or the euro) into a realistic option in every European country’.
Back in June 2016, when Kaletsky expressed his sense of alarm, the populist movement that led to Britain leaving the EU could still be dismissed as a one-off event. At successive annual meetings of the Davos clique, participants expressed the hope that the threat posed by their populist opponents had waned. The well-known Indian-American commentator Fareed Zakaria was hopeful that ‘2023 could be the year that exposes populism for the sham that it is’. Numerous anti-populist commentators echoed his sentiment. ‘We seem to have passed peak populism’, predicted Andrew Adonis, a leading British Remainer voice. He described Brexit as ‘an absurd and damaging project based on a host of populist lies’. Adonis’ association of populism with ‘lies’ and dishonesty expresses the principal argument that his side uses to undermine the moral status of their opponents. By drawing a contrast between the fake populist and the truthful Davos Man, people like Adonis imagine that they can undermine the appeal of their political foes.
It has been well over seven years since Kaletsky raised the alarm about the challenge posed by populism to the institutions of the EU. Since that time, movements designated as populist have gained considerable momentum. The election of Giorgia Meloni in Italy in 2022 clearly showed that, despite the accusation that her party relied on populist lies, her party could win. Recently the surprise victory of Geert Wilders and his Freedom Party in the Netherlands showed that populism has become a serious force. In France, Marine Le Pen and her party, the National Rally, are now leading all the opinion polls. A similar pattern of growing support for populist parties is evident in Germany, Austria, Belgium, Sweden and other parts of Northern Europe.
There are some deluded adherents of the WEF’s worldview who actually believe that the principal factor contributing to the success of populist parties is their weaponization of misinformation and lies. However, the inventors of the claim that fake news has become a global existential risk are opportunistic purveyors of this argument. What they are really worried about is their inability to manage democracy. Their elevation of Fake News into a global risk serves as a form of Freudian displacement activity. Unable to face the truth, which is that they lost the argument, they point the finger of blame at the influence of lies and misinformation. In this way their inability to convince the electorate is blamed on the nefarious dishonesty of their opponents.
Concern with Fake News not only represents an attempt to discredit political opponents; it also represents the denigration of the capacity of citizens to make informed choices in elections. From this perspective, those who reject the advice and outlook of the WEF oligarchy are not independent-minded and intelligent voters, but unthinking people drawn towards the purveyors of Fake News. The logical outcome of this perspective is the conviction that citizens are not fit to make difficult political choices and therefore must be protected from the consequences of their actions.
Paradoxically, the report recognises that the proliferation of misinformation is likely to encourage governments and media outlets to tighten the policing of public discussions. It warns that as it becomes harder to tell the difference between what is real and what is fake, press freedom could be threatened. Its reservations about the threat posed by the policing of public debate notwithstanding, the authors of the report have supported and highlighted the argument used to justify the creation of new systems of gatekeeping and fact checking on the web. Its concern about press freedom is a form of hypocrisy: the compliment that vice pays to virtue.
One final point. Perversely, the hysteria promoted about the threat posed by misinformation and Fake News has the effect of disorienting public life. The constant assertion that this or that claim is Fake News leads many people to mistrust mainstream media sources of information. Loss of faith in established media institutions and narratives often runs in parallel with the proliferation of fantasies about conspiracies.
Whatever its genuine objective, the authors’ panic about technology-assisted misinformation actually contributes to the erosion of trust relations within society. But that’s not a problem for the WEF. What they are worried about is the threat to their way of life posed by democracy.
JAN 13, 2024
Last week, I wrote about ~20 different ideas that I've been thinking about touching on this year. Rather than starting to randomly prioritize, I thought I'd ask my readers (you) what you were most interested in. I very quickly realized I'd made a critical mistake. I should have sent out a survey or something, because inviting responses to the email invited hundreds of replies into my inbox / Twitter DMs that I had to manually review one by one. Live and learn! But after tallying the results, the topic you were most interested in was this one:
The Puritans of Venture Capital: Venture used to be a cottage industry. Some firms are still practicing as if nothing really changed. Can they survive? Will capital agglomerators eat their lunch? Or can they co-exist?
It was closely tied with Books 2.0, which I'll try and touch on next week. In fact, it came down to one vote! Mr. Hunter Walk was the final vote pushing Puritans to 133 vs. Books 132. Tight race!
So let's dig in!

People like humble beginnings. I think it's because they're easier to understand. It's easier to envision Steve Jobs and Steve Wozniak in a garage with a soldering iron and some microchips. It’s much harder to comprehend a $2.9 TRILLION company with $30 billion of cash, 2 billion active devices, and 161K employees.
The larger something gets, the harder it is to comprehend. There's no one single person who comprehends every aspect of a business the size and scale of Amazon, or Microsoft. They are living, breathing organisms.
What's interesting about large, complex organizations is that their marketing rarely even attempts to reflect the size and complexity of the thing. There's no incentive for complex organizations to make themselves easier to understand. The only incentive is to make the story attractive. So the marketing sticks to the narrative of humble beginnings, and only goes as far as "we're just getting started."
When it comes to venture capital, the complexity is far more nuanced.
Venture capital is approaching its dotage. Arguably, the first modern venture firms were American Research and Development Corporation (ARDC) and J.H. Whitney & Company, both founded in 1946. So we're coming up on venture capital's 78th birthday. But as we approach the end of venture's first century, it's important to put that in "asset class years," akin to dog years. I've written before about how, relative to asset classes like debt that have been around for thousands of years, 78 just isn't very old.
There are some great books out there that have done a much better job outlining the history of venture capital than I could do. The two I've read recently are VC: An American History by Tom Nicholas and The Power Law by Sebastian Mallaby. I won't re-prosecute the entire evolution of venture. But the one key point is that, in terms of economic scale, venture didn't really start to be relevant until the 80s, and it didn't explode until the dot-com boom.
In the grand scheme of global private markets AUM in 2022, venture capital represented just 22%, alongside other asset classes like buyouts, private debt, and real estate.

Venture funding didn't even cross $10B a quarter until ~1999. Total venture capital deployed in the US hit $100B for the first time in 2000 during the first internet bubble.

Next, global venture funding peaked in 2021 at over $600B.

Since then, funding amounts have come back to earth with ~$200B invested in 2023 compared to the $600B+ in 2021.

As a business, venture capital is deploying ~$200-300B a year globally. While the ramp has been volatile, with spikes in 2000 and 2021, that figure will likely continue to grow. So the broader question is: HOW are those hundreds of billions being deployed?
People often talk about how venture capital is a "cottage industry." This goes back to when small manufacturing operations were run out of someone's home, or cottage. So when people say venture was a cottage industry, it's almost like calling it a "Mom & Pop shop."
As an industry, I don't think that's quite right. One of the most famous early venture investments was when Arthur Rock backed the Traitorous Eight in building Fairchild Semiconductor. The money came from Sherman Fairchild, a millionaire businessman with defense contracts from Raytheon, etc. It was more a function of big companies funding new ideas vs. a side-hustle run from home.

Mario Gabriele explains how much value was produced from that group:
"As of 2014, an estimated 92 companies trace their roots back to Fairchild Semiconductor founders and employees (some suggest the number is closer to 400), with $2 trillion in value created. That includes Apple, Advanced Micro Devices, and Applied Materials, as well as venture firms like KPCB and Sequoia."
So right off the bat, we're dealing with big dollars, big egos, and big outcomes. Not much of a cottage-based yarn spinning operation, right? So what do people mean when they describe venture capital as a "cottage industry?" They're typically talking more about venture capital partnerships. The mechanisms through which capital deployment decisions were made, and the organizations behind those decisions.
Historically, firms like Benchmark or Accel had teams of ~6-9 partners. For example, Accel in July 2006 had 9 in the US.
…More
JAN 10, 2024
VC has long been a cottage industry that has seen little innovation. This is particularly surprising as VCs themselves are the ones backing the most disruptive businesses. They have a front-row seat when it comes to the adoption of new technologies and business model innovation, yet in the first 60 years following the industry’s inception in the 1950s, the only change was the shift from pen & paper to computer & MS Office.
The reason for this lack of innovation is most likely the absence of competition and pressure to change. Access to capital for startups with less traditional business models and a lack of collateral has historically been heavily constrained. This is why the VC industry evolved in the first place and, unfortunately, this reality is still true for the majority of new startups today.
As a result of a supply-side constrained market, VCs could long afford to be picky and weren't forced to innovate. Until the early 2010s. Maturing ecosystems and cheap-money policies increased new firm formation, but also the assets under management per firm. Access to capital gradually became more available for startups, and the shift from a supply-side constrained market to a more balanced one (in 2020 and 2021, partially even a demand-side constrained market) suddenly forced investors to get their act together.
Ever since I joined the VC industry in 2017, I’ve been observing, pushing, and writing about growing innovation in this rusty industry. In today’s episode, I’d like to zoom out again and look at the major trends and predictions for 2024 and beyond. Let’s dive in!

The VC industry faces a range of challenges. Overpriced portfolio companies, lack of exit channels, DPI and performance issues, fundraising struggles, generational transition, diversity, you name it.
In the past decade, the VC market has seen only one direction: up and to the right. Following 2022, however, this has suddenly changed, and I expect a natural selection among firms in the next year or two.

Looking at the drivers of this prediction, I’d like to double-click on the most dominant market-related components.
Venture returns are power-law distributed. Few outsized winners deliver the majority of returns. For this logic to work, VCs need exit channels like IPOs and M&A with significant liquidity.
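To make the power-law point concrete, here is a minimal simulation (my sketch, with illustrative parameters only, not real fund data) of a 100-company portfolio whose return multiples follow a heavy-tailed Pareto distribution:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical portfolio: 100 investments whose return multiples follow
# a heavy-tailed Pareto distribution. The shape parameter is illustrative.
multiples = rng.pareto(a=1.2, size=100) + 1.0

total = multiples.sum()
top5_share = np.sort(multiples)[-5:].sum() / total

print(f"Top 5 of 100 investments deliver {top5_share:.0%} of total returns")
```

Runs like this routinely show a handful of positions carrying most of the fund, which is exactly why the few outsized winners need real exit liquidity to matter.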
Deflating public markets at the end of 2022 and the resulting liquidity crunch were anything but helpful. Since then, many VCs have sat on piles of paper money but cannot divest and deliver DPI (distributions to paid-in capital).
This translates directly to LPs, who in turn have limited resources for new engagements and re-ups. Consequently, their deployment strategy for 2023, and at least until the re-opening of IPO windows towards the end of 2024 or even early 2025, is extremely selective.
“The $67bn raised by US VCs in 2023 is the lowest annual total since 2017 and represents a 60 per cent drop from the $173bn raised in 2022, the peak year for fundraising, according to analysis by private markets data provider PitchBook and the National Venture Capital Association. Globally, in 2023 venture investors raised the lowest level of capital since 2015” (source FT Jan 5 2024)
Based on feedback from various institutional LPs, most of them have cut back on new engagements with emerging managers and become hyper-focused on performance KPIs for 3rd generation+ GPs.
Second fund generations are a bit of a different breed as they tend to follow the inaugural fund about 3 to 4 years later and are unlikely to deliver tangible performance that soon. Thus, whenever LPs invest in first fund generations, they typically subscribe (at least in their mind) to the second generation too.
“LPs are aware that when the second fund comes along, they won’t yet know how well the first fund has performed,” says Jeremy Uzan. “In a way, they already knew that they would back fund one and fund two” (source Singular Fund II Announcement, Dec 14 2023)
I expect several GPs with insufficient exit track records and/or other challenges to disappear in the next year or two.
This will most likely hit firms that were founded in the rising market of 2010-2018ish, as they're old enough for LPs to require KPIs (raising 3rd generation+) but too young to have distributed tangible money to their LPs. As a result, they either cut headcount to extend runway into hopefully friendlier market environments, close shop, or join forces with other firms.

Though VC as an industry has historically seen very little M&A, recent activity (driven by different motivations, some even from a mutual position of strength; examples above) might provoke a broader appetite among established VC firms to acquire strategic assets, like a brand, a portfolio, or an investment team, from struggling GPs in order to enter new markets.
Our partnership at Earlybird has seen three major downturns since our firm’s inception in 1997: Dotcom, GFC, and COVID. Based on first-hand experience and internal analyses, we find that the private venture capital cycle can be split into 4 major phases.
…More
JAN 8, 2024
The truism “You are what you eat” has never been more apt than in machine learning models and products. The impressive proof-of-concepts built by experts and hobbyists alike, along with a growing number of production use cases, have only driven up the demand for data.
Good data, web data, free data.
Data that’s structured and data that’s unstructured (especially unstructured).
Junk data, organic & free range ethical data.
Is there really a difference when it comes to delivering incredible experiences and new features with the God-tier models on the market? Isn’t all data “good data” for the purposes of machine learning at this point?
Let’s take a stroll back through some of machine learning’s greatest examples of data-related flops for a quick refresher of how data quality can have, and has had, detrimental (and measurable) individual and societal impacts, at least until those projects were pulled.
Most companies aren’t quick to publicize their failings, as retrospectives and post-mortems are used more as internal exercises to prevent future hits to the company’s current share price rather than for external-facing accountability. That’s not to say there aren’t examples, especially in sensitive areas like medicine, education, and civil rights.
For example, iTutorGroup, a tutoring company thought to be using ML-powered recruiting software, recently settled a lawsuit alleging that its screening process performed age discrimination by screening out candidates who were “women aged 55 or older and men who were 60 or older.” Was a filter intentionally set, or could this have been an incident where the training data was biased towards younger candidates, based on past hiring decisions potentially enshrining the prejudices of the hiring managers?
This wouldn’t be the first time that poor design, data collection, model training, or experimental design has failed an ML product, especially in ways that were detrimental to the individuals unfairly caught up in the dragnets of AI. For example, the growing number of cases of facial recognition being used to wrongfully arrest people of color has started to shed light on how poorly designed and evaluated these models and their training datasets are, and on the ubiquitous lack of best practices.
Even if the data is “good” from the perspective of reflecting the biases and systemic injustices present in real-world conditions, do we really want a mirror image? For example, do we want to automate into existence a world where inequitable access to medical care is being used to exclude Black patients from being identified as in need of “high-risk care management” programs by hospitals and insurance companies? (Additional links: Loyola Health Law Review, Racial Bias Found in a Major Health Care Risk Algorithm)
Poor data handling doesn’t just hurt “we the people,” it’s also costly.
As an analyst creating financial forecasting models, what if you were using outdated or bad financial data? Aside from inaccurate predictions and forecasts, resulting in poor investment decisions and significant financial losses for the model’s end users, you could end up losing your company customers (and yourself a job).
Are you a clinical director test-driving a new diagnostic tool meant to replace your favorite and expensive radiologist? You’ve just found out that the underlying model was initially trained with mislabelled data and certain instances were tagged with the wrong diagnosis, such as Epic’s Sepsis Model that had to be revamped. Have fun with those malpractice lawsuits.
The irony is that even as we struggle with data quality on easy mode, for analytic or simple machine learning use cases, data quality is the next “big thing” in a future where generative AI, multimodal ML systems, and streaming will become the norm. However, it could be argued that data quality has been the real hero the entire time. After all, the success behind GPT-4 (and even GPT-3) had as much (if not more) to do with their investments in their data as their algorithmic research.
For example, Karpathy’s presentation on the GPT assistant training pipeline highlighted the huge corpus of raw (ugly) data initially used, then an additional corpus of data created by contractors, then additional rounds of manual evaluation and prompt response, during which multiple iterations of models were trained (parts of this process constitute a technique called RLHF, Reinforcement Learning From Human Feedback).
In fact, if you consider all the recent winners and leaders of the generative AI wave, from big cloud enterprise companies to lean and mean cutting-edge startups, it’s clear that the secret sauce was in how they architected their data engines.
The implication is that for companies looking to build AI or use AI as part of their services and core offerings, whether that be developing the next large model (maybe even the next large vision model or multimodal model) or developing an autonomous vehicle service, the most important area of opportunity is developing the processes and infrastructure for systematically engineering data quality into their pipelines. Building in continuous improvement and iteration, whether the data is structured or unstructured, is a key prerequisite to ensuring a company isn't left behind as a relic in a data-centric future.
Machine learning and generative AI aren't going to kill your business. Dall-E and Midjourney could have been the death knell for Adobe; instead, more than 1 million images have been generated using Firefly.
One of the worst mistakes companies and data developers could make is thinking it’s already too late to leverage their data to create powerful products and services. Anyone who’s been involved in data governance efforts at an enterprise company with a legacy stack and a sprawl of data sources would be hard-pressed to see how they could close the gap between themselves and their more svelte competitors.
Yet astute data developers know they shouldn’t fear the reaper: the emergence of Gen-AI is pushing companies to acknowledge the poor state of their data, especially as a shortage of data for training LLMs is on the horizon. Running out of training data, as foundation models keep growing and consuming publicly available, high-quality data far faster than it is replenished, seems like a weird problem, and an opportunity.
How? For starters, most companies and organizations can't afford the years and millions of dollars of funding to create the equivalent of Google Brain or DeepMind, exploring new model architectures that may offer only incremental improvements over all the LLMs and foundation models available on the market (both proprietary and open-source). But data is something all companies have in abundance: data that is specific, private, and custom to the company and its use cases. And even if you don't have a readily available dataset, you can always buy data the way OpenAI, Google, and Meta have (amongst many others).
Companies that have invested in enabling a data flywheel at the product level know that they’ll be okay, especially because the next challenge with the democratization of LLMs will demand differentiation at the data level, especially of higher quality data for fine-tuning models (rather than swamp loads of poor quality data).
You’ve bought into the possibility that data quality can be impactful, not just for the current and most prevalent analytic and machine learning use cases. You’ve also acknowledged that there seems to be a relationship between your company’s current level of data quality and the myriad opportunities to capitalize on the recent advancements of generative AI. And you’re eager to leverage the data you already have (especially since it means dodging the copyright issues other companies seem to be running into) and ready to make the necessary investments in data quality.
The question then becomes: What drivers of data quality in the machine learning lifecycle can you influence?
In order to answer this critical question, we’ll need to define data quality (especially from the machine learning perspective) and the specific problems that data-centric AI addresses (and how).
It’s worth starting with the different touch points between data and code in typical software systems and projects. Quite simply, the data is secondary. The application logic is primary.
What do I mean? I mean that the actual logic of the application or microservice doesn’t depend on the data, its structure, its schema, or the distribution of its values. If you want a website that loads fast and lists all items being added in a specific order or based on filters, you don’t really need to know too much about the items being listed other than the associated metadata or attributes that need to be called or returned. You’re not necessarily changing every site page dynamically depending on the items themselves, and even if you are, it’s still based on some set attribute like category (“Baby Supplies” versus “Arts and Crafts,” or “Workout Equipment” versus “Wood Working”). You could take Target’s entire SKU catalog and swap it with Amazon or Walmart, and as long as the data specs match up, you’re good. (Sans special products, prices, and rewards. IYKYK.)
Effective data management, particularly in the formulation of a well-suited training dataset, holds significance for enhancing model performance & improving training efficiency during pretraining & supervised fine-tuning phases. – [2312.01700] Data Management For Large Language Models: A Survey
A machine learning model is its data: all the sets of data required to train (or fine-tune) it. The three types of data artifacts created throughout the machine learning lifecycle where quality is key are:
Training Data
Inference Data
Maintenance Data (including metadata).


How does quality play a role in each of these datasets?
Manual exploratory data analysis is an important process (and rite of passage for new data developers) because oftentimes, that’s where data quality issues are initially identified. Early exploration can often yield valuable information on the relationships of different entities and processes within the business to each other. Scatter plots, heat maps, bar charts, correlation analysis, etc., can reveal opportunities to apply transformation to the data in the form of feature engineering or the creation of inputs to downstream machine learning models. Well-defined, normalized, and consistent data (as well as documentation) can speed up this process significantly, cutting down on back & forths.
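As a minimal illustration of that first pass (assuming a hypothetical training_data.csv; the checks are generic), profiling for missingness, duplicates, and correlations might look like:

```python
import pandas as pd

# Hypothetical training dataset; the file name and columns are illustrative.
df = pd.read_csv("training_data.csv")

# Basic profiling often surfaces quality issues before any model is trained.
print(df.dtypes)
print(df.isna().mean().sort_values(ascending=False))  # share of missing values per column
print(f"duplicate rows: {df.duplicated().sum()}")

# Correlations between numeric columns can reveal redundant features
# or leakage between features and the label.
print(df.select_dtypes("number").corr())

# Distribution checks for categorical columns.
for col in df.select_dtypes("object"):
    print(df[col].value_counts(normalize=True).head())
```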
We support journalism, partner with news organizations, and believe The New York Times lawsuit is without merit.

January 8, 2024
Our goal is to develop AI tools that empower people to solve problems that are otherwise out of reach. People worldwide are already using our technology to improve their daily lives. Millions of developers and more than 92% of Fortune 500 companies are building on our products today.
While we disagree with the claims in The New York Times lawsuit, we view it as an opportunity to clarify our business, our intent, and how we build our technology. Our position can be summed up in these four points, which we flesh out below:
We collaborate with news organizations and are creating new opportunities
Training is fair use, but we provide an opt-out because it’s the right thing to do
“Regurgitation” is a rare bug that we are working to drive to zero
The New York Times is not telling the full story
We work hard in our technology design process to support news organizations. We’ve met with dozens, as well as leading industry organizations like the News/Media Alliance, to explore opportunities, discuss their concerns, and provide solutions. We aim to learn, educate, listen to feedback, and adapt.
Our goals are to support a healthy news ecosystem, be a good partner, and create mutually beneficial opportunities. With this in mind, we have pursued partnerships with news organizations to achieve these objectives:
Deploy our products to benefit and support reporters and editors, by assisting with time-consuming tasks like analyzing voluminous public records and translating stories.
Teach our AI models about the world by training on additional historical, non-publicly available content.
Display real-time content with attribution in ChatGPT, providing new ways for news publishers to connect with readers.
Our early partnerships with the Associated Press, Axel Springer, American Journalism Project and NYU offer a glimpse into our approach.
Training AI models using publicly available internet materials is fair use, as supported by long-standing and widely accepted precedents. We view this principle as fair to creators, necessary for innovators, and critical for US competitiveness.
The principle that training AI models is permitted as a fair use is supported by a wide range of academics, library associations, civil society groups, startups, leading US companies, creators, authors, and others that recently submitted comments to the US Copyright Office. Other regions and countries, including the European Union, Japan, Singapore, and Israel, also have laws that permit training models on copyrighted content.
That being said, legal right is less important to us than being good citizens. We have led the AI industry in providing a simple opt-out process for publishers (which The New York Times adopted in August 2023) to prevent our tools from accessing their sites.
Our models were designed and trained to learn concepts in order to apply them to new problems.
Memorization is a rare failure of the learning process that we are continually making progress on, but it’s more common when particular content appears more than once in training data, like if pieces of it appear on lots of different public websites. So we have measures in place to limit inadvertent memorization and prevent regurgitation in model outputs. We also expect our users to act responsibly; intentionally manipulating our models to regurgitate is not an appropriate use of our technology and is against our terms of use.
Just as humans obtain a broad education to learn how to solve new problems, we want our AI models to observe the range of the world’s information, including from every language, culture, and industry. Because models learn from the enormous aggregate of human knowledge, any one sector—including news—is a tiny slice of overall training data, and any single data source—including The New York Times—is not significant for the model’s intended learning.
Our discussions with The New York Times had appeared to be progressing constructively through our last communication on December 19. The negotiations focused on a high-value partnership around real-time display with attribution in ChatGPT, in which The New York Times would gain a new way to connect with their existing and new readers, and our users would gain access to their reporting. We had explained to The New York Times that, like any single source, their content didn't meaningfully contribute to the training of our existing models and also wouldn't be sufficiently impactful for future training. Their lawsuit on December 27—which we learned about by reading The New York Times—came as a surprise and disappointment to us.
Along the way, they had mentioned seeing some regurgitation of their content but repeatedly refused to share any examples, despite our commitment to investigate and fix any issues. We’ve demonstrated how seriously we treat this as a priority, such as in July when we took down a ChatGPT feature immediately after we learned it could reproduce real-time content in unintended ways.
Interestingly, the regurgitations The New York Times induced appear to be from years-old articles that have proliferated on multiple third-party websites. It seems they intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate. Even when using such prompts, our models don’t typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts.
Despite their claims, this misuse is not typical or allowed user activity, and is not a substitute for The New York Times. Regardless, we are continually making our systems more resistant to adversarial attacks to regurgitate training data, and have already made much progress in our recent models.
We regard The New York Times’ lawsuit to be without merit. Still, we are hopeful for a constructive partnership with The New York Times and respect its long history, which includes reporting the first working neural network over 60 years ago and championing First Amendment freedoms.
We look forward to continued collaboration with news organizations, helping elevate their ability to produce quality journalism by realizing the transformative potential of AI.
January 11, 2024 by Jeff Jarvis
Well, that was surreal. I testified in a hearing about AI and the future of journalism held by the Senate Judiciary Subcommittee on Privacy, Technology, and the Law. Here is my written testimony and here’s the Reader’s Digest version in my opening remarks:
It was a privilege and honor to be invited to air my views on technology and the news. I went in knowing I had a role to play, as the odd man out. The other witnesses were lobbyists for the newspaper/magazine and broadcast industries and the CEO of a major magazine company. The staff knew I would present an alternative perspective. My fellow panelists noted before we sat down — nicely — that they disagreed with my written testimony. Job done. There was little opportunity to disagree in the hearing, for one speaks only when spoken to.
What struck me about the experience is not surprising: They call the internet an echo chamber. But, of course, there’s no greater echo chamber than Congress: lobbyists and legislators agreeing with each other about the laws they write and promote together. That’s what I witnessed in the hearing in a few key areas:
Licensing: The industry people and the politicians all took as gospel the idea that AI companies should have to license and pay for every bit of media content they use.
I disagree. I draw the analogy to what happened when radio started. Newspapers tried everything to keep radio out of news. In the end, to this day, radio rips and reads newspapers, taking in and repurposing information. That’s to the benefit of an informed society.
Why shouldn’t AI have the same right? I ask. Some have objected to my metaphor: Yes, I know, AI is a program and the machine doesn’t read or learn or have rights any more than a broadcast tower can listen and speak and vote. I spoke metaphorically, for if I had instead argued that, say, Google or Meta has a right to read and learn, that would have opened up a whole can of PR worms. The point is obvious, though: If AI creators would be required by law to license *everything* they use, that grants them lesser rights than media — including journalists, who, let’s be clear, read, learn from, and repurpose information from each other and from sources every day.
I think there’s a difference in using content to train a model versus producing output. It’s one matter for large language models to be taught the relationship of, say, the words “White” and “House.” I say that is fair and transformative use. But it’s a fair discussion to separate out questions of proper acquisition and terms of use when an application quotes from copyrighted material from behind a paywall in its output. The magazine executive cleverly conflated training and output, saying *any* use required licensing and payment. I believe that sets a dangerous precedent for news media itself.
If licensing and payment is required for all use of all content, then I say the doctrine of fair use could be eviscerated. The senators argued just the opposite, saying that if fair use is expanded, copyright becomes meaningless. We disagree.
JCPA: The so-called Journalism Competition and Preservation Act is a darling of many members of the committee. Like Canada’s disastrous Bill C-18 and Australia’s corrupt News Media Bargaining Code — which the senators and the lobbyists think are wonderful — the JCPA would allow large news organizations (those that earn more than $100,000 a year, leaving out countless small, local enterprises) to sidestep antitrust and gang together and force platforms to “negotiate” for the right to link to their content. It’s legislated blackmail. I didn’t have the chance to say that. Instead, the lobbyists and legislators all agreed how much they love the bill and can’t wait to try again to pass it.
Section 230: Members of the committee also want to pass legislation to exclude generative AI from the protections of Section 230, which enables public discourse online by protecting platforms from liability for what users say there while also allowing companies to moderate what is said. The chair said no witness in this series of hearings on AI has disagreed. I had the opportunity to say that he has found his first disagreement.
I always worry about attempts to slice away Section 230’s protections like a deli bologna. But more to the point, I tried to explain that there is nuance in deciding where liability should lie. In the beginning of print, printers were held liable — burned, beheaded, and behanded — for what came off their presses; then booksellers were responsible for what they sold; until ultimately authors were held responsible — which, some say, was the birth of the idea of authorship.
When I attended a World Economic Forum AI governance summit, there was much discussion about these questions in relation to AI. Holding the models liable for everything that could be done with them would, in my view, be like blaming the printing press for what is put on and what comes off it. At the event, some said responsibility should lie at the application level. That could be true if, for example, Michael Cohen was misled by Google when it placed Bard next to search, letting him believe it would act like search and giving him bogus case citations instead. I would say that responsibility generally lies with the user, the person who instructs the program to say something bad or who uses the program’s output without checking it, as Cohen did. There is nuance.
Deep fakery: There was also some discussion of the machine being used to fool people and whether, in the example used, Meta should be held responsible and expected to verify and take down a fake video of someone made with AI — or else be sued. As ever, I caution against legislating official truth.
The most amusing moment in the hearing was when the senator from Tennessee complained that media are liberal and AI is liberal and for proof she said that if one asks ChatGPT to write a poem praising Donald Trump, it will refuse. But it would write a poem praising Joe Biden and she proceeded to read it to me. I said it was bad poetry. (BTW, she’s right: both ChatGPT and Bard won’t sing the praises of Trump but will say nice things about Biden. I’ll leave the discussion about so-called guardrails to another day.)
It was a fascinating experience. I was honored to be included.
For the sake of contrast, in the morning before the hearing, I called Sven Størmer Thaulow, chief data and technology officer for Schibsted, the much-admired (and properly so) news and media company of Scandinavia. Last summer, Thaulow called for Norwegian media companies to contribute their content freely to make a Norwegian-language large language model. “The response,” the company said, “was overwhelmingly positive.” I wanted to hear more.
Thaulow explained that they are examining the opportunities for a native-language LLM in two phases: first research, then commercialization. In the research phase now, working with universities, they want to see whether a native model beats an English-language adaptation, and in their benchmark tests, it does. As a media company, Schibsted has also experimented with using generative AI to allow readers to query its database of gadget reviews in conversation, rather than just searching — something I wish US news organizations would do: Instead of complaining about the technology, use it to explore new opportunities.
Media companies contribute their content to the research. A national organization is making a blanket deal and individual companies are free to opt out. Norway being Norway — sane and smart — 90 percent of its books are already digitized and the project may test whether adding them will improve the model’s performance. If it does, they and government will deal with compensation then.
All of this is before the commercial phase. When that comes, they will have to grapple with fair shares of value.
How much more sensible this approach is than what we see in the US, where technology companies and media companies face off, with Capitol Hill as their field of play, each side trying to play the refs there. The AI companies, to my mind, rushed their services to market without sufficient research about impact and harm, misleading users (like hapless Michael Cohen) about their capabilities. Media companies rushed their lobbyists to Congress to cash in the political capital earned through journalism, seeking protectionism and favors from the politicians their journalists are supposed to cover independently. Politicians use legislation to curry favor in turn with powerful and rich industries.
Why can’t we be more like Norway?
JAN 5, 2024

There are two ways to think about ChatGPT.
One way is that it’s exceptionally good at doing stuff. We give it an instruction, and, because it’s been trained on an enormous corpus of human language, it can respond in a way that would, if anything, fail the Turing test for being too capable. In an instant, it can write a French sonnet about New Year’s Eve; it can create an argument for why it should be a felony to write songs in C major; it can understand a 700-word blog post from which all the vowels have been removed. For tasks like these—writing emails, creating lesson plans, finding and booking a restaurant for a six-person get-together next Friday in New Orleans—ChatGPT is valuable because of what it can do.
The second way to think about it is that it knows things. We ask an LLM like ChatGPT a question; it tells us the answer. It’s valuable because it’s read every encyclopedia and textbook and Reddit post in the world, and can summarize—and in some cases, recreate—what those things say. Though LLMs don’t store this information in a traditional sense—there is no file in GPT-4 that contains the full text of the Declaration of Independence, for example—ChatGPT can still rewrite the entire document. In this way, LLMs aren’t useful because of what they can do, but because of what they know—like who Calvin’s babysitter was, who scored the most points in a WNBA game, and which song starts with the notes “da da da dum.”
This second version of ChatGPT—the one that, above all, knows things—is the version that caused people to declare Google dead, and caused Google to freak out, when OpenAI released it. Whereas Google can find links that might answer your questions, ChatGPT answers them directly. Its appeal was as the ultimate lazyweb.
If this is the role that LLMs come to occupy—Google 2.0, basically—copyrighted content from books and news publishers is immensely valuable to OpenAI. To replace Google, ChatGPT would need to “know” most of what Google can find—and Google can search the entire internet, including copyrighted websites. Without access to that content, ChatGPT isn’t a better Google; it’s a chatbot for summarizing Wikipedia and Reddit.
If ChatGPT ultimately occupies the first role—a bot that does stuff; an agent—OpenAI doesn’t need copyrighted material. An AI agent would be useful for the same reasons that a human agent is useful, and human agents are useful because they can complete complex tasks based on ambiguous instructions. They don’t need to know that much; they need to be able to communicate, reason about problems, and look stuff up. And just as a human assistant can be a good assistant without memorizing the script of Star Wars or what was said in the Wall Street Journal yesterday, an LLM can probably be trained to be a useful agent without being trained on copyrighted content. Give it enough high-quality text, from any source, and it can learn to talk as well as any of us.
Despite the initial panic at Google, I’d be surprised if ChatGPT comes for search. Though that’s partially because LLMs aren’t, on their own, reliable narrators of fact, it’s much more because the economic value of agents that do stuff is potentially far greater than the economic value of a chatbot that knows stuff. “We can help your accountants answer common questions about tax regulations” is a nice pitch, but a fundamentally incremental improvement over Google; “we can create an infinite army of cheap digital labor that can do a lot of the tasks your employees do” is transformative. The frontier of ChatGPT’s potential isn’t in replacing Google, but in using Google—and in the same way that the cost of manual labor made industrialization all but inevitable, the cost of skilled labor probably makes the agentization of fake email jobs all but inevitable too.
In other words, for the enhanced search engine that OpenAI is today, copyrighted content is necessary. Omniscient oracles need to read the news to be omniscient. But for the autonomous agents they’ll likely become, copyrighted material is simply convenient—news websites, for example, are generally reliable, accurate, well-written, constantly produced in large quantities, and can be collected from relatively centralized sources. But any sufficiently diverse body of text will do.
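Stancil’s agent, which communicates, reasons, and looks things up rather than memorizing facts, corresponds to the standard tool-use loop in agent systems: the model picks a tool, observes the result, and repeats until it can answer. A toy sketch of that loop, with hypothetical tools and a stubbed model so it runs standalone:

```python
# Minimal tool-using agent loop: the model doesn't need to "know" facts,
# it needs to decide which tool to call and interpret the result.
# Everything here is a hypothetical illustration of the pattern.

def web_search(query: str) -> str:
    # Stand-in for a real search API call.
    return f"(search results for: {query})"

def calculator(expression: str) -> str:
    # Toy only; never eval untrusted input in real code.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"search": web_search, "calc": calculator}

def model_step(task: str, observations: list[str]) -> tuple[str, str]:
    # Stand-in for an LLM choosing the next action. A real agent would
    # prompt the model with the task and the prior observations.
    if not observations:
        return ("search", task)
    return ("finish", observations[-1])

def run_agent(task: str, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        action, arg = model_step(task, observations)
        if action == "finish":
            return arg
        observations.append(TOOLS[action](arg))
    return "gave up"

print(run_agent("book a table for six in New Orleans next Friday"))
```

Nothing in the loop requires the model to have memorized copyrighted text; the facts arrive through the tools at runtime, which is exactly the distinction the piece draws.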

By Alexandra Lindsay and Greg Dale
Jan. 5, 2024 9:03 AM PST
Want to feel old? It was more than five years ago that director Jordan Peele teamed up with BuzzFeed to create a viral deepfake video of Barack Obama uttering a series of improbable lines, a clip meant to serve as a public service announcement for the dangers of how technology could be used to manipulate public opinion. “It may sound basic, but how we move forward in the age of information is going to be the difference between whether we survive or become some kind of fucked-up dystopia,” Peele said, ventriloquizing Obama.
Now, on the precipice of the 2024 election season, that dystopia is just around the corner, courtesy of artificial intelligence—or so say some of the doomsayers. Last week, Fortune magazine quoted Oren Etzioni, an AI expert and professor emeritus at the University of Washington, who imagines a coming flood of AI-fabricated content showing President Joe Biden being rushed to the hospital or depicting a run on banks. “I am completely terrified,” Etzioni said.
And in June, former Google chair Eric Schmidt told CNBC that AI-generated misinformation in this year’s election was one of the biggest short-term dangers from the technology. “The 2024 elections are going to be a mess because social media is not protecting us from false generated AI,” Schmidt said.
We’re not so sure about that. As technologists with years of experience in the political trenches, we have collectively worked on digital strategy for hundreds of campaigns at the federal, state and local level, in addition to national voter mobilization campaigns.
From our perspective, the impact of AI on this election is likely to be more nuanced than many people predict (including the American public: A recent poll showed that 58% of American adults are concerned about the use of AI increasing the spread of false information during the 2024 presidential election).
Don’t get us wrong—there are real reasons to worry about how rapid advances in AI technology will impact elections. We’ve just emerged from several election cycles in which everyone from Russian agents to presidential candidates themselves spread disinformation widely via social networks. In this new era, foreign adversaries, campaign managers and meme lords will increasingly explore and exploit generative AI’s newfound abilities to make an impact on the political scene. But the AI election apocalypse isn’t likely to happen this year.
Here are our predictions for how AI will (and will not) impact the 2024 U.S. election cycle:
Alexandra Lindsay is co-author of AI Political Pulse, a newsletter dedicated to the politics and policy of artificial intelligence. She is the board chair at Close the Gap California, a nonprofit that recruits women to run for the California State Legislature. She formerly served as Head of Product and Operations at Tech for Campaigns, a nonprofit focused on bringing advanced digital marketing and data science to politics.
Greg Dale is co-author of AI Political Pulse, a newsletter dedicated to the politics and policy of artificial intelligence, and is a marketing and product consultant. He formerly served as CEO of Tech for Campaigns, working with hundreds of Democratic campaigns and independent expenditures at all levels of the ballot.
JAN 7, 2024
A few disclaimers before we get into this boondoggle:
I’m not a reporter - I’m a humble tech CFO who thinks this (or whatever this is) is an important issue for the venture-backed ecosystem
Also, I’m holding an infant in my left hand, typing with my right, and haven’t shaved since Thursday - not exactly TechCrunch or Axios material
I’m currently a (happy? - tbd) Carta customer
I believe the financial infrastructure they’ve built for private markets revolutionized transparency, trust, and liquidity for startups
I interviewed their CFO Charly Kevers on my podcast Run the Numbers last year, and I think he’s one of the best in the game
OK, let’s ride!
On Friday, Linear founder and Carta customer Karri Saarinen dropped a bombshell.

The one sentence summary would be:
Carta offers two popular products - cap table management and a platform for secondary transactions - and the founder of a company who uses Carta for cap table management thinks Carta is using confidential shareholder info to facilitate trades of his company’s stock, without his approval, on their trading platform.
But first - what are secondary transactions? And when do they occur?
Privately held venture-backed companies often grant their employees stock options as a form of compensation. It also helps attract talent at a lower salary - the company gives you less cash today in exchange for unlimited upside tomorrow.
However, these stock options are typically not liquid until the company goes public, which can take several years. As a result, many pre-IPO companies allow their employees to participate in tender offers, commonly called secondary transactions, which allow them to sell a portion of their vested shares to outside investors.
A tender offer is a liquidity event in which a company, investor, or group of investors propose to buy a fixed number of shares from existing shareholders at a set price. Tender offers can be made for both private companies and public companies, with recent examples including OpenAI and Stripe.
It’s important to note that the company does not get any money from a secondary transaction. The balance sheet does not change. While primary dollars fund future operations, M&A, and new hires, a secondary transaction creates no new shares; existing shares merely change hands.
They’re a total pain in the ass to administer. The company merely acts as an approver.
Tender offers typically occur in conjunction with a later stage fundraise (Series C and beyond). This is the sweet spot where founders have been at it for long enough to take a little off the table.
Anytime before then is usually a pretty big red flag to investors - it would be suspect if a Series A founder wanted to line their pockets before the company has achieved product market fit.
That’s why all tender offers require board approval.
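To make the mechanics concrete, here is a toy worked example, with invented numbers, of what a shareholder nets in a tender offer. Note that the proceeds go to the seller, not the company:

```python
# Toy tender-offer arithmetic with invented numbers; illustrates that the
# seller receives the proceeds and the company's balance sheet is untouched.
vested_shares = 10_000        # employee's vested, exercised shares
offer_price = 25.00           # per-share price set by the buyer(s)
participation_cap = 0.20      # companies often cap how much each holder may sell

sellable = int(vested_shares * participation_cap)  # 2,000 shares
gross_proceeds = sellable * offer_price            # $50,000 to the employee

# The company's share count and cash are unchanged: the same 2,000 shares
# simply move from the employee to the outside investor on the cap table.
print(f"Shares sold: {sellable}, gross proceeds: ${gross_proceeds:,.2f}")
```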
Here’s the crux of it all - is Carta using their private, confidential, asymmetrical information to shake loose transactions, which they would benefit from, without founder / CEO / CFO / Board approval?
Whew!…..
Connie Loizos @cookie / 7:30 PM PST•January 8, 2024

Roughly 72 hours after a prominent startup customer complained that Carta was misusing information with which it was entrusted — scaring many of Carta’s tens of thousands of other customers in the process — Carta is exiting the business that landed it in trouble with the customer.
Carta co-founder and CEO Henry Ward posted on Medium tonight that: “Because we have the data, if we are trading secondaries, people will always worry that we are using the data, even if we are not. So we have decided to prioritize trust, and exit the secondary trading business.”
It’s a dramatic turn of events for 14-year-old Carta, which originally focused on cap table management software but began over time to evolve into a “private stock market for companies” to take advantage of the network of companies and investors that already use its platform and into which it has insights. The big idea was to become the transfer agent, brokerage and clearinghouse for all private stock transactions in the world.
While the move made Carta more valuable in the eyes of its venture backers — a company has to scale, after all! — it put the company on dangerous footing after Finnish CEO Karri Saarinen posted on LinkedIn on Friday that Carta was using information about his company’s investor base to try to sell its shares to outside buyers without the company’s knowledge or consent.
Wrote Saarinen, whose project management software company Linear is four years old and a Carta customer: “As a founder it feels kind [of] shitty that Carta, who I trust to manage our cap table, is now doing cold outreach to our angel investors about selling Linear shares to their non disclosed buyers.” Saarinen continued: “They never contacted us (their customer) about starting an order book for Linear shares. The investor they reached out to is a family member whose investment we never published anywhere. We and they never opted in to any kind of secondary sales. Yet Carta Liquidity found their email and knew that they owned Linear shares.”
While Ward apologized publicly to Saarinen, blaming a rogue employee who “violated our internal procedures and went out of bounds reaching out to customers they shouldn’t have,” Saarinen continued the discussion very publicly, saying he had identified numerous other founders whose investors had also been contacted by Carta representatives without their knowledge.
In his post tonight, Ward downplayed the impacts of ending secondary trading on Carta, saying the revenue derived from the practice is minuscule compared with Carta’s other business offerings. According to Ward, Carta’s cap table business “is about $250M/year, fund administration is about $100M, private equity is about $20M, and the secondary trading business is about $3M.” Carta, he added, has done a “decent job of building the cap table business, an ok job at fund admin (but feeling the growing pains), and an abysmal job at the secondary business.”
Further, he continued, having precious customer data that others do not isn’t the superpower that outsiders may think — certainly not if Carta is going to be a good actor in the private company ecosystem.
Ward, widely known to be brusque, strikes an uncharacteristically humble tone in the Medium post, writing, “ALL of my ideas around liquidity — auctions, investor matching, secondary trading, open tender offers, have not worked. I might not be the entrepreneur that can solve this problem.” Indeed, he continued, “Carta might not be the company that can solve this problem. Many people think we are best poised to solve liquidity because we have cap table data. But that same argument is used for data products. People say ‘You have all the data so you should put Pitchbook out of business!’ But it is precisely because we have the data, that we can’t use it. It is our customers’ data, not ours. That’s why in ten years, Carta has never released a data product. I use Pitchbook and TechCrunch when I research a company before I meet the CEO.”
“Having ground truth data is not an advantage if we can’t use it. And it is a disadvantage if people think we use it,” added Ward.
To Carta’s credit, the decision to back out of the secondary sales business came quickly; Carta also seemed to have little choice, with many founders threatening to move their startups’ business elsewhere after the events of this past weekend.
As founder Sim Desai of the financial services startup Hiive wrote on LinkedIn yesterday, “[A]side from [Carta’s] apparent breach of trust (possible to fix) and their lack of expertise (hard to fix), Carta faces another impossible conflict between these two business models. Even if they are not using their customers’ confidential information, it is the optics of a potential breach that will stand in the way.”
How the move impacts Carta’s own valuation remains to be seen, as does whether the company sticks to its guns once the startup market rebounds — along with demand for secondary shares.
In the meantime, if you missed the row with Linear that set tongues wagging over the weekend, you can read our earlier coverage here.
On Friday we had an internal policy violation that affected three companies. I’ve been in touch with the founders and I’m appalled we made that mistake and it should never have happened. It is unacceptable and we’ve dealt with the violation on Saturday morning and are continuing the investigation to make sure it never happens again.
Let me share our framework on data privacy and access controls to hopefully address concerns from this weekend. For a deeper dive, I will group data privacy into four buckets, each with different rules, and cover each separately.
1. Public Disclosures: We can only publish aggregate and anonymous data. So we can say things like there are 34K startups on Carta, or the average Series A startup has 25 employees, etc… However, we cannot say Acme Startup has 41 shareholders or the PPS is $13.24. You will see this type of aggregate anonymous information frequently in our data reports.
2. Internal Systems Disclosures: We can use cap table data for onboarding and internal systems development. So for example, we can load cap table data into dashboards for audit, we can write health checks to make sure cap table reports are correct, we can run machine learning algorithms to predict when you need a 409A, etc… We can use cap table data to help us improve the software or customer experience. This also includes things like when support teams access cap tables (through an approval and audit system) or when a customer needs help correcting or updating their cap table. All human access to cap tables is tracked and audited.
3. Sales & Marketing: Third, we can market to our customers and users. For example, we can offer new products to help companies with employee compensation, taxes, and expense reporting. Occasionally we have offered products directly to employee shareholders. For example, in the past we have offered stock based loan products to employees of certain companies where employees can access loans to exercise their stock. But when we offer these products to employees we only do it in collaboration with the company. The company has to approve the program for their employees for us to offer it.
4. CartaX: CartaX is a separate product that operates as an opt-in marketplace where investors are invited to enter bids and asks on different companies. At any given time we have about one hundred companies that are in the marketplace. Where CartaX and the cap table business converge is if we match a trade in the marketplace, we go to the company and ask if they will allow it. If the company allows it, we use their cap table to execute the trade. If the company doesn’t allow it, we stop the trade. We do not and will never trade without company consent.
In the case of Linear and two other companies, we had an internal breach of protocol and we contacted someone directly on the cap table. That never should have happened and is absolutely a breach of our privacy protocols. And we have addressed it over the weekend.
The second question is whether we are too close to the cap table business to be helping with liquidity. We started CartaX five years ago to help founders and companies with liquidity and it has mostly been a net positive for founders, employees, and shareholders. But even if we do everything perfectly and make zero mistakes, perhaps just the appearance of being in the liquidity business makes us seem compromised. Everything we do must be grounded in trust and if being in the liquidity business compromises that trust, perhaps we need to reevaluate that offering.
I will think about this and come back with more thoughts in the coming months. If you have a perspective on whether Carta should be helping companies with liquidity, please reach out to me. I’d love to hear it.
I’m sorry for scaring everybody about this. After ten years of managing cap tables across 40,000 startups, I promise we aren’t compromising anyone’s data. We won’t be here if you don’t trust us. Trust, transparency, and integrity are our most important currency. If you would like to chat with me more one-on-one, please email me at henry.ward@carta.com and we can set up a Zoom.
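Ward’s fourth bucket amounts to a consent gate: a matched trade may only touch a cap table if the company opts in. A minimal sketch of that rule as he describes it, with hypothetical names rather than Carta’s actual system:

```python
# Hedged sketch of a consent-gated secondary trade, per the policy described
# above: a matched bid/ask only reaches the cap table if the company opts in.
# All names are hypothetical; this is not Carta's implementation.
from dataclasses import dataclass

@dataclass
class MatchedTrade:
    company: str
    seller: str
    buyer: str
    shares: int

def company_approves(company: str) -> bool:
    # Stand-in for asking the company; defaults to False so no trade can
    # proceed without an explicit opt-in.
    approvals = {"ExampleCo": True}
    return approvals.get(company, False)

def execute_trade(trade: MatchedTrade) -> str:
    if not company_approves(trade.company):
        return f"Trade blocked: {trade.company} has not approved secondary sales."
    # Only at this point would the cap table be read to settle the transfer.
    return f"{trade.shares} shares of {trade.company}: {trade.seller} -> {trade.buyer}"

print(execute_trade(MatchedTrade("ExampleCo", "angel@example.com", "fund@example.com", 500)))
```

The violation described above is, in these terms, an outreach that happened before the consent check, which is why it reads as a process failure rather than a product feature.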
January 12, 2024
Ever since Garry Tan came on as Y Combinator CEO last year, there have been changes.
Last March, Tan cut YC’s late-stage investing and laid off 17 investors, and he shrank the size of YC’s batches. Less known is that the startup accelerator also moved its headquarters. After spending 17 years operating out of Mountain View, Y Combinator moved its operations up north to the Dogpatch neighborhood in San Francisco and into Pier 70, according to city records reviewed by Fortune and confirmed by Y Combinator.
In an interview, Y Combinator CEO Garry Tan said it was important to him that YC be as close to the cutting edge of artificial intelligence innovation as possible (OpenAI and Anthropic and “a lot of the top talent” in artificial intelligence are in San Francisco versus Silicon Valley, he says). Not to mention—the majority of Y Combinator partners, including himself, live in San Francisco, he adds.
These days, “you sort of have to be in San Francisco,” Tan says, noting the importance of the accidental run-ins that happen when people are out and about. “Chances are the people around you are thinking about and talking about technology, and especially A.I. That’s really special,” Tan says. He also pointed out that YC data shows that startups that were built in San Francisco were more likely to succeed than their peers.
The move from Mountain View, which happened in the spring, right before YC welcomed its summer batch of startups, is also part of Y Combinator’s broader effort to bring its program back to being fully in-person post-COVID. Y Combinator started doing so in Summer 2022, though its demo days have still been remote. (YC says its next Demo Day will be partially in-person at Pier 70, though presentations will still be online.)
Tan doesn’t just want founders back in the Bay Area: He wants them very close by. The Y Combinator CEO says the accelerator is highly encouraging founders to get places in the Dogpatch or Potrero Hill, or at least nearby. “I think it’s actually going to be really good that people know to be in walking distance of each other, and I think the connections between the founders are just going to be that much stronger,” he said, noting that he hopes Y Combinator …
CASEY NEWTON, JAN 8, 2024

Substack is removing some publications that express support for Nazis, the company said today. The company said this did not represent a reversal of its previous stance, but rather the result of reconsidering how it interprets its existing policies.
As part of the move, the company is also terminating the accounts of several publications that endorse Nazi ideology and that Platformer flagged to the company for review last week.
The company will not change the text of its content policy, it says, and its new policy interpretation will not include proactively removing content related to neo-Nazis and far-right extremism. But Substack will continue to remove any material that includes “credible threats of physical harm,” it said.
In a statement, Substack’s co-founders told Platformer:
If and when we become aware of other content that violates our guidelines, we will take appropriate action.
Relatedly, we’ve heard your feedback about Substack’s content moderation approach, and we understand your concerns and those of some other writers on the platform. We sincerely regret how this controversy has affected writers on Substack.
We appreciate the input from everyone. Writers are the backbone of Substack and we take this feedback very seriously. We are actively working on more reporting tools that can be used to flag content that potentially violates our guidelines, and we will continue working on tools for user moderation so Substack users can set and refine the terms of their own experience on the platform.
Substack’s statement comes after weeks of controversy related to the company’s mostly laissez-faire approach to content moderation.
In November, Jonathan M. Katz published an article in The Atlantic titled “Substack Has a Nazi Problem.” In it, he reported that he had identified at least 16 newsletters that depicted overt Nazi symbols, and dozens more devoted to far-right extremism.
Last month, 247 Substack writers issued an open letter asking the company to clarify its policies. The company responded on December 21, when Substack co-founder Hamish McKenzie published a blog post arguing that “censorship” of Nazi publications would only make extremism worse.
McKenzie also wrote that “we don’t like Nazis either” and said Substack wished “no-one held those views.” But “we don't think that censorship (including through demonetizing publications) makes the problem go away,” he wrote. “In fact, it makes it worse. We believe that supporting individual rights and civil liberties while subjecting ideas to open discourse is the best way to strip bad ideas of their power.”
The statement seemed to be at odds with Substack’s published content guidelines, which state that “Substack cannot be used to publish content or fund initiatives that incite violence based on protected classes.”
In its aftermath, several publications left the platform. Others, including Platformer, said they would leave if the company did not remove pro-Nazi publications.
Meanwhile, more than 100 other Substack writers, including prominent names like Bari Weiss and Richard Dawkins, signed a post from writer Elle Griffin calling on Substack to continue with its mostly hands-off approach to platform-level moderation.
From its inception, McKenzie and Substack co-founder Chris Best have touted freedom of speech as one of Substack’s core virtues. As a result, the platform has been embraced by fringe thinkers, who have built large businesses while promoting anti-vaccine pseudo-science, Covid conspiracy theories and other material that is generally restricted on mainstream social networks.
Substack has defended its approach by arguing that it is built differently from social networks, which optimize for engagement rather than subscription revenue. The company says it employs a “decentralized” approach to moderation that allows individual readers to decide which writers they want to subscribe to; and lets writers determine which comments they will allow and which blogs they will recommend.
(Incidentally, this approach means that you can’t currently report comments directly to Substack: only writers receive your reports. Platformer has reviewed several cases of violent material and death threats in Substack comments.)
At the same time, over the past couple years Substack has come to more closely resemble the social networks it often criticizes. Each week, Substack sends users a personalized, algorithmically ranked digest of posts from writers they don’t yet follow — a feature that can help fringe publications build larger audiences and make more money than they would otherwise.
And last year Substack launched Notes, a text-based social feed similar to Twitter that also surfaces personalized content in a ranked feed. Notes can also give heightened visibility and free promotion to extremists.
The question now is whether taking action against some pro-Nazi accounts will shift the perception that Substack is a home for the most extreme ideologies, and prevent an exodus among writers who prefer more aggressive content moderation.
In recent weeks, Platformer has worked with other journalists and extremism researchers in an effort to understand the scope of far-right content on the platform. We’ve now reviewed dozens of active, monetized publications that advance violent ideologies, including anti-Semitism and the great replacement theory.
Substack has argued that extremist publications represent only a small fraction of newsletters on the platform, and as far as we can tell this is true. At the same time, the site’s recommendations and social networking infrastructure are designed to enable individual publications to grow quickly. And the company’s outspoken embrace of fringe viewpoints all but ensures that the number of extremist publications on the platform will grow.
The company is now in a difficult position. Having branded itself as a bastion of free speech, it risks driving away writers who chose the platform in part for its rejection of aggressive content moderation with any change to its content policy. At the same time, other publications — Platformer included — have lost scores of paying customers who do not want to contribute to a platform that they see as advancing the cause of extremism.
In coming days, explicitly Nazi publications on Substack are slated to disappear. But the greater divide within its user base over content moderation will remain. The next time the company has a content moderation controversy — and it will — expect these tensions to surface again.
Substack’s removal of Nazi publications resolves the primary concern we identified here last week. At the same time, as noted above, this issue has raised concerns that go beyond the small group of publications that violate the company’s existing policy guidelines.
JAN 12, 2024
At the end of November, an article by Jonathan Katz appeared at The Atlantic, with the foreboding title “Substack has a Nazi problem”. (It seems more portentous with the original Random Headline Capitals, but you’re at a British English publication now, so suck up the lowercase.) Katz began:
The newsletter-hosting site Substack advertises itself as the last, best hope for civility on the internet—and aspires to a bigger role in politics in 2024. But just beneath the surface, the platform has become a home and propagator of white supremacy and anti-Semitism. Substack has not only been hosting writers who post overtly Nazi rhetoric on the platform; it profits from many of them.
“Profits from many of them.” This is quite a big claim, and you need to read pretty closely to see whether Katz manages to stand it up.
An informal search of the Substack website and of extremist Telegram channels that circulate Substack posts turns up scores of white-supremacist, neo-Confederate, and explicitly Nazi newsletters on Substack—many of them apparently started in the past year. These are, to be sure, a tiny fraction of the newsletters on a site that had more than 17,000 paid writers as of March…
…More (Charles has asked me to cut early so that you click through for the rest).
CASEY NEWTON, JAN 11, 2024

After much consideration, we have decided to move Platformer off of Substack. Over the next few days, the publication will migrate to a new website powered by the nonprofit, open-source publishing platform Ghost. If you already subscribe to Platformer and wish to continue receiving it, you don’t need to do anything: your account will be ported over to the new platform.
If all goes well, following the Martin Luther King Jr. holiday on Monday, you’ll receive the Tuesday edition of Platformer as normal. If you have any issues with your subscription after that, please let us know.
Today let’s talk about how we came to this decision, the debate over how platforms should moderate content, and why we think we’re better off elsewhere.
I.
When I launched Platformer on Substack in 2020, it was not in the belief that we would be here forever. Tech platforms come and go; in the meantime, they can also change in ways that make staying there impossible for the creators that rely on them. For this reason, I almost launched Platformer on a custom-built stack of services centered on WordPress, the way my inspiration Ben Thompson had done for Stratechery.
But Substack had some compelling advantages of its own. It was impressively fast and easy to set up. It paid to design Platformer’s logo. It offered me a year of healthcare subsidies, and ongoing legal support.
I also felt a personal connection to Substack’s co-founders, who believed that Platformer would succeed even before it had a name. They convinced me that I could thrive on their platform, and offered me a welcome boost in confidence as I considered leaving the best job I ever had to strike out on my own.
In the three years since, Substack has been a mostly happy home. Platformer has grown tremendously over that time, from around 24,000 free subscribers to more than 170,000 today. Our paid subscribers have allowed me to create new jobs in journalism. I’m proud of the work we do here.
Over that same period, Substack has faced occasional controversies over its laissez-faire approach to content moderation. The platform hosts a wide range of material I find distasteful and offensive. But for a time, the distribution of that material was limited to those who had signed up to receive it. In that respect, I did not view the decision to host Platformer on Substack as being substantially different from hosting it on, for example, GoDaddy.
But as I wrote earlier this week, Substack’s aspirations now go far beyond web hosting. It touts the value of its network of publications as a primary reason to use its product, and has built several tools to promote that network. It encourages writers to recommend other Substack publications. It sends out a weekly digest of publications for readers to consider subscribing to. And last year it launched a Twitter-like social network called Notes that highlights posts from around the network, regardless of whether you follow those writers or not.
Not all of you use these features. Some of you might not have seen them. But I can speak to their effectiveness: In 2023, we added more than 70,000 free subscribers. While I would love to credit that growth exclusively to our journalism and analysis, I believe we have seen firsthand how quickly and aggressively tools like these can grow a publication.
And if Substack can grow a publication like ours that quickly, it can grow other kinds of publications, too.
II.
In November, when Jonathan M. Katz published his article in The Atlantic about Nazis using Substack, it did not strike me as cause to immediately leave Substack. All platforms host problematic and harmful material; I assumed Substack would remove praise for Nazis under its existing policy that “Substack cannot be used to publish content or fund initiatives that incite violence based on protected classes.”
And so, after reading the open letter from 247 writers on the platform calling for clarity on the issue, I waited for a response.
The response, from Substack co-founder Hamish McKenzie, arrived on December 21. It stated that Substack would remove accounts if they made credible threats of violence but otherwise would not intervene. “We don't think that censorship (including through demonetizing publications) makes the problem go away — in fact, it makes it worse,” he wrote. “We believe that supporting individual rights and civil liberties while subjecting ideas to open discourse is the best way to strip bad ideas of their power.”
This was the moment where I started to think Platformer would need to leave Substack. I’m not aware of any major US consumer internet platform that does not explicitly ban praise for Nazi hate speech, much less one that welcomes them to set up shop and start selling subscriptions.
But suddenly, here we were.
I didn’t want to leave Substack without first getting my own sense of the problem. I reached out to journalists and experts in hate speech and asked them to share their own lists of Substack publications that, in their view, advanced extremist ideologies. With my colleagues Zoë Schiffer and Lindsey Choo, I reviewed them all and attempted to categorize them by size, ideology, and other characteristics.
In the end, we found seven that conveyed explicit support for 1930s German Nazis and called for violence against Jews, among other groups. Substack removed one before we sent it to them. The others we sent to the company in a spirit of inquiry: will you remove these clear-cut examples of pro-Nazi speech? The answer to that question was essential to helping us understand whether we could stay.
It was not, however, a comprehensive review of hate speech on the platform. And to my profound disappointment, before the company even acted on what we sent them, Substack shared the scope of our findings with another, friendlier publication on the platform, along with the information that these publications collectively had few subscribers and were not making money. (It later apologized to me for doing this.)
The point of this leak, I believe, was to make the entire discussion about Nazi hate speech on Substack appear to be laughably small: a mountain made out of a molehill by bedwetting liberals.
To us, the six publications we had submitted had only ever been a question: would Substack, in the most clear-cut of all speech cases, do the bare minimum?
In the end, it did, in five out of six cases. As all of this unfolded, I spoke twice with Substack’s co-founders. And while they asked that those conversations be off the record, my understanding from our conversations — based on material they had shared with me in writing — was that in the future they would regard explicitly Nazi and pro-Holocaust material to be a violation of their existing policies.
But on Tuesday, when I wrote my story about the company’s decision to remove five publications, that language was missing from their statement. Instead, the company framed the entire discussion as having been about the handful of publications I had sent them for review.
I attempted to write a straightforward news story about all this, and wound up infuriating many readers. On the right, I faced criticism for making a fuss out of Substack hosting a handful of small Nazi publications. On the left, I faced even louder criticism for (in their view) appearing to celebrate and validate Substack’s removal of those same publications. (I wrote Tuesday that “Substack’s removal of Nazi publications resolves the primary concern we identified here last week.” I regret using that language. What I should have said was “Substack did the basic thing we asked it to,” and then emphasized that it did not address our larger concerns. Which I did go on to say, though not with the force that in hindsight I wish I had.)
I’m happy to take my lumps here. I just want to say again that to me, this was never about the fate of a few publications: it was about whether Substack would publicly commit to proactively removing pro-Nazi material. Up to the moment I published on Tuesday, I believed that the company planned to do this. But I no longer do.
From there, our next move seemed clear. But first I wanted to consult our readers, whose advice and support I have been so lucky to rely on over these past few years. Asking readers for their thoughts proved to be surprisingly controversial, especially in the Sidechannel Discord, where some of you wondered whether I was seeking a fig leaf of approval that we could use to justify staying here. But Platformer has as its readers some of the world’s smartest minds in content moderation and trust and safety — I sincerely wanted to get your thoughts before making a final decision.
Over the next 48 hours, the Platformer community raised a variety of sensible objections to how Substack had handled this issue. You pointed out that Substack had not changed its policy; that it did not commit explicitly to removing pro-Nazi material; that it seemed to be asking its own publications to serve as permanent volunteer moderators; and that in the meantime all of the hate speech on the platform remains eligible for promotion in Notes, its weekly email digest, and other algorithmically ranked surfaces.
In emails, comments, Substack Notes and callouts on social media, you’ve made your view clear: Platformer should leave Substack. We waited a day to announce our move as we finalized plans with Ghost and began our migration. But today we can say clearly that we agree with you.
Substack’s tools are designed to help publications grow quickly and make lots of money — money that is shared with Substack. That design demands responsible thinking about who will be promoted, and how.
The company’s defense boils down to the fact that nothing that bad has happened yet. But we have seen this movie before, from Alex Jones to anti-vaxxers to QAnon, and will not remain to watch it play out again.
III. Frequently asked questions about Substack and free speech
We’re still only talking about six newsletters. Aren’t you overreacting?
To be clear, there are a lot more than six bad publications on Substack: our analysis found dozens of far-right publications advocating for the great replacement theory and other violent ideologies.
…More
JAN 12, 2024

A fairly contrived effort to endlessly link the word Substack to the word Nazi has had some moderate success, unfortunately. Or at least enough success to have sparked an open letter republished on many individual Substacks calling on Substack to get rid of Nazis, a counter–open letter calling on it to maintain its liberal content-moderation standards, a statement from Substack co-founders Hamish McKenzie, Chris Best, and Jairaj Sethi explicitly stating that they do not plan to ban Nazis from the platform, a bunch of Substackers responding by leaving or threatening to leave if Substack doesn’t moderate the content it hosts more aggressively, and a spate of news coverage of all of the above.
This is all pretty odd given that Substack’s content guidelines are conspicuously written to hew quite closely to the First Amendment on matters of alleged hate speech and have been in place for more than two years. Plus, the site’s founders have been very consistent about their lack of interest in adopting a more conservative approach to speech on Substack, even when sticking to their guns has led to bad PR. Since it’s clear that on Substack, almost anything goes that doesn’t involve a credible threat of violence, no one should be surprised that unsavory types can set up shop here.
Earlier this week I critiqued the reporting of Casey Newton, arguing that his work on the controversy for his publication Platformer was shoddy and misleading, and seemed designed to obscure key information from his readers. At the end of the day, after what Newton described as a rather comprehensive search for extremist content on Substack, he and his team sent the company a grand total of six publications they believed violated its standards, and Substack banished five of them while declining to actually change its written policies. The publications in question, Substack told Newton in a statement, had 100 active readers between them and none had paid subscriptions turned on. Newton quoted selectively from Substack’s response, in a manner that excluded the number of Substacks he had reported, their moribund nature, and their lack of paid readership. When I asked Newton why he had left out this information, his answer — that revealing how many Nazi publications his team reported to Substack would put him and his team at risk of harassment at the hands of the Nazi authors in question — didn’t really make sense, and he wouldn’t elaborate on it. (Newton has since announced Platformer is leaving Substack.)
In this post I’d like to focus mostly on the article that started this whole affair: Jonathan M. Katz’s late November piece in The Atlantic, “Substack Has a Nazi Problem.” It turns out Katz almost entirely fabricated what is perhaps his most damning anecdote about Substack’s approach to extremism. After I lay out, in detail, how he did this, I’ll explain how The Atlantic (and Katz) responded to my critique. Then I’ll close with a discussion of the difficulty of developing consistent content moderation guidelines, drawing on several Substack competitors’ deeply troubled attempts to do so.
ANALYSIS BY CHERYL KNIGHT
In an exclusive analysis by theCUBE Research, industry experts assess the breaking news that Hewlett Packard Enterprise Co. has confirmed its acquisition of Juniper Networks Inc. for approximately $14 billion.
The acquisition — HP’s largest since its Autonomy Ltd. purchase in 2011, prior to its split into HPE and HP Inc. — is poised to double the size of HPE’s networking business, making it a major contributor to HPE’s annual operating income. The deal is strategically aligned with HPE’s ambitions in the networking sector, leveraging Juniper’s advancements in artificial intelligence, particularly through its Mist AI service, which enhances wireless access and network security and positions HPE favorably in the burgeoning AI and cloud-native market spaces.
“From HPE’s standpoint, the marriage of HPE and Juniper makes a lot of sense,” said Dave Vellante (pictured, second from left), theCUBE Research analyst. “HPE’s got silicon chops going back to pre-split. If you think about HPE’s as-a-service portfolio, they have compute down with their service business, they’ve got Aruba, and now they’re adding in Juniper. They’ve got storage.”
Vellante and industry analyst John Furrier (left) spoke with Zeus Kerravala (middle), founder and principal analyst at ZK Research; Jake Kaldenbaugh (second from right), managing partner at CloudStrategies; and Steve Mullaney (right), former chief executive officer of Aviatrix, about the benefits and potential drawbacks of this major acquisition.
Is the acquisition a pure consolidation play to lower costs, raise revenue and increase industry leverage or an attempt to combine portfolios for true innovation opportunities?
HPE’s acquisition of Juniper Networks is less about pioneering uncharted technological territories and more about strengthening its current market position, according to Kaldenbaugh. It’s a consolidation play, aiming to unify and leverage the strengths of both companies, he added.
“This deal feels much more like a Broadcom-VMware play than it does somebody who’s reaching for the future,” he said.
The panelists also agreed that the pending deal is seen as a strategic effort to enhance HPE’s edge-to-cloud capabilities and compete more effectively against industry giants such as Cisco Systems Inc.
“If you look at the market and you look at the technology synergies between HPE and Juniper in this deal, particularly how their combined portfolios can put innovation back at the center of HPE’s strategy — specifically in AI and cloud-native environments — this acquisition is expected to double HPE’s networking business, creating a formidable position play against Cisco and others well,” Furrier said.
While the panel highlighted the potential for innovation, especially in AI and security assets, there’s also an acknowledgment that the acquisition could be viewed as a consolidation play, given the current trends in the industry. While the deal strengthens HPE’s portfolio, it’s crucial for HPE to not just focus on physical networking hardware, but also pivot toward software and cloud-based solutions, aligning with the industry’s shift from traditional hardware-centric approaches, according to Mullaney.
“The problem with what they’re doing it’s very much focused still on the physical world of networking, boxes,” he said. “It shifted from boxes to software and cloud about five years ago. They won’t have the growth until they actually start going after where the growth is, which is in the cloud. The growth will come when they really, truly understand that this is a cloud-centric, cloud-first kind of world.”
Also important in assessing the acquisition is the critical aspect of scale in the networking industry, according to Kerravala. The merger could forge a larger entity, better positioned to compete with dominant players such as Cisco. This perspective considers the historical challenges both HPE and Juniper have faced in growing their market share.
With stocks of both companies showing lateral movement, it’s clear that competing in a market led by a heavyweight such as Cisco is no small feat. In networking, where size significantly impacts a company’s ability to serve large, global clients, this merger could be a strategic move to create a more formidable competitor.
“If you combine the two companies together, you get in theory a much bigger company that can compete with Cisco,” noted Kerravala, encapsulating the potential of the deal to transform the competitive dynamics in the networking sector. This analysis underscores the merger’s rationale as a bid not just for growth, but for relevance and competitive parity in a challenging industry.
Here’s theCUBE’s complete video analysis:
January 4, 2024

A slower final quarter ended a lackluster year for global startup funding as venture capital investors continued to hold back in 2023, Crunchbase data shows.
In all, 2023 is on pace to be the lowest year for venture funding since 2018. Global startup investment in 2023 reached $285 billion — marking a 38% decline year over year, down from the $462 billion invested in 2022.

Cutbacks were deep across all funding stages globally. Early-stage funding in 2023 was down more than 40% year over year, late stage by 37%, and seed just over 30%.
It’s worth keeping some perspective, though: Overall funding in 2023 was down by less than 20% when compared to the pre-pandemic years of 2018 to 2020.
Two years into the slowdown, the venture markets are still reckoning with the funding boom of 2021. The fall in tech stocks and the slowdown of the IPO market since the beginning of 2022 have tempered the industry. Valuations set in 2021 did not hold up in 2023, as promising companies raised flat and down rounds.
Startups last year navigated a tough funding environment, tightened their belts and focused on unit economics. Layoffs across tech deepened in 2023.
Investors deployed capital more sparingly, with a higher bar at each stage.
“You can get higher ownership as a fund than you could in 2021,” said Michael Cardamone of New York-based seed investor Forum Ventures. The current funding environment favors funds and is more difficult for startup founders, he said.
The U.S. — the largest startup investment market with about half of all venture funding — mirrored global trends. Funding to U.S.-based startups in 2023 totaled $138 billion, down by 37% year over year.
While most industries were down year over year, AI was the largest sector to show an increase. Global funding to AI startups reached close to $50 billion last year, up 9% from the $45.8 billion invested in 2022. The largest fundings in 2023 went to foundation model companies OpenAI, Anthropic and Inflection AI, which collectively raised $18 billion in 2023.
Semiconductors and battery tech also saw increased investment in 2023.
Two industries, however, stood out as performing better than broader market declines. Manufacturing and cleantech startups were down in 2023 year over year, but by less than 20%.
Web3, which experienced a runup in 2021 and into 2022, fell 73% year over year in 2023, from $28 billion to $7.6 billion.
Other leading sectors that were down year over year include financial services (down over 50%), e-commerce and shopping (down 60%), and media and entertainment (down 64%).
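All of these percentages follow from the same year-over-year formula; checking a few against the dollar figures quoted above:

```python
# Year-over-year change computed from the dollar figures cited above (in $B).
def yoy(prev: float, curr: float) -> float:
    return (curr - prev) / prev * 100

print(f"Global: {yoy(462, 285):+.0f}%")   # about -38%
print(f"AI:     {yoy(45.8, 50):+.0f}%")   # about +9%
print(f"Web3:   {yoy(28, 7.6):+.0f}%")    # about -73%
```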
Q4 marks the lowest quarter for global venture funding in 2023. Quarterly funding totaled $58 billion, down 24% quarter over quarter and 25% year over year.


Seed funding totaled $7 billion in Q4, down just over 20% year over year from $9 billion.
Despite the cutbacks, seed is seen as the most robust funding stage, with new companies still getting funded. And as it became more challenging to raise a Series A round, companies were more likely to raise follow-on seed funding.

Early-stage funding declined the most in 2023 compared to other funding stages.
In the fourth quarter, early-stage funding totaled close to $23 billion, down a tad quarter over quarter, and down 32% year over year from $33 billion.

Late-stage funding in the fourth quarter was 25% of the volume of the peak in Q4 2021.
Fourth-quarter funding reached $28.6 billion, down close to 20% year over year.
Funding at this stage fluctuated throughout 2023 as large fundings went to AI, semiconductor, battery and clean energy companies.

With the increased number of companies funded in recent years, and the tightening funding markets, we expect the layoffs of 2023 will give way to more companies closing in 2024.
The venture markets got more disciplined in 2023. Without a bump in exits, 2024 will continue to be tough for founders in a funders’ market.
What is Truth? Quite a philosophical question for a weekly newsletter about technology and venture capital. But this week’s content seems to require the question.
Substack is accused of hosting Nazi content; Casey Newton’s Platformer announced first that it is staying on the platform, then said it is leaving. Much of the reason rests on reader feedback in between the two decisions. Jonathan Katz, writing in the Atlantic, has stated:
The newsletter-hosting site Substack advertises itself as the last, best hope for civility on the internet—and aspires to a bigger role in politics in 2024. But just beneath the surface, the platform has become a home and propagator of white supremacy and anti-Semitism. Substack has not only been hosting writers who post overtly Nazi rhetoric on the platform; it profits from many of them.
Below, you will find accusations that this characterization is, at best, exaggerated and, at worst, intentionally malicious.
It turns out Katz almost entirely fabricated what is perhaps his most damning anecdote about Substack’s approach to extremism. After I lay out, in detail, how he did this, I’ll explain how The Atlantic (and Katz) responded to my critique. Then I’ll close with a discussion of the difficulty of developing consistent content moderation guidelines, drawing on several Substack competitors’ deeply troubled attempts to do so.
Similar events unfolded concerning Carta, the company that hosts the share tables of many tech startups.
The CEO of one such company, Linear, posted a long X piece outlining that Carta had approached an investor in the company to assess their readiness to sell their shares in a secondary transaction.
First of all, I posted this publicly because I suspected there is a broader systematic issue with Carta. A company that is dealing with extreme trust, corporate cap table and other private matters, should take safe guarding confidential information seriously .
Since then I’ve learned from multiple companies that this has been going on for months or even years where investors or employees of private companies are solicited by Carta employees to put their shares on sale. These people haven’t opted in to this and companies haven’t approved these sales.
If Carta and Carta Marketplace employees have free access to company information and cap table information in order to generate secondary sales (which companies often don’t want) it all starts to seem rotten.
Carta responded quickly and asserted that the incident was a one-off impacting three companies and was instigated by a rogue employee going outside their protocols. CEO Henry Ward then posted that Carta would exit the secondary sales market due to trust issues. There was a lot of back and forth between those two endpoints. During one of them, Ward accused the CEO of Linear:
Henry Ward, Carta’s CEO, acknowledged the mistake but questioned Saarinen’s continued use of Carta despite his public criticism. “But despite feeling so upset about our mistaken email that you are calling for the end of Carta, and eliminate 2,000 jobs and strand 40,000 customers, you didn’t ask to cancel your contract with Carta,” tweeted Ward. “It seems you are still planning to stay with us despite all of the public bashing? I don’t understand? Was this just to firebomb us for your personal twitter and LinkedIn exposure?”
The truth in all of this is hard to find. But Ward questioning Saarinen’s motives came close to suggesting he was not wronged, even though he clearly was.
Normal debates in social media are often carried out with accusations of “fake news” and counter-accusations of lying. “You’re lying” has become a sufficient response to a view that one disagrees with. It goes alongside the weaponization of words as a bullying tactic. “Nazi” is now routinely used to describe almost any conservative view. “Holocaust” is used to describe any violence toward any group of people, when it properly means the deliberate extermination of an entire group, and for no other reason. Intolerance and language intertwine to create a culture where discussion and difference cannot exist.
The real world is not as “clean” as we would all like. The desire to cleanse it of views we find intolerable is a very illiberal instinct. And “lying” or “fake news” are simply cleansing mechanisms, not real discussion or debate.
Frank Furedi provides this week’s Essay of the Week, opening up a discussion of these issues. He examines the Davos World Economic Forum publication, The Global Risk Report 2024.
The Global Risk Report 2024 more or less claims that misinformation and disinformation constitute the greatest risk facing society in the period ahead.
And quotes from it:
…emerging as the most severe global risk anticipated over the next two years, foreign and domestic actors alike will leverage Misinformation and disinformation to further widen societal and political divides
Furedi concludes:
At the heart of the discussion about Fake News is the question of who gets to decide what is false and what is real. And that is a roundabout way of saying who gets to decide what is true. One of the most important ways that a society comes to a consensus about what is true is through argument and debate and political struggle. Democratic elections are not just about choosing specific policies but also about deciding whose view of the world should prevail. In recent elections the hitherto hegemonic status of the globalist worldview has come under challenge by newly emerging populist and anti-status quo movements, many of whom have gone from strength to strength. It is the fear of the outcome of the many impending elections that has motivated the authors of the WEF report to brand misinformation as a global existential threat.
His point about the need for debate to decide choices and outcomes is clearly right. Shouting down an opponent or accusing them of lying is the bullying tactic of a poor debater. But more importantly, democracy itself is threatened if speech can be bullied into silence.
The attempt to force Substack to close down publications that are within its terms of service is a form of bullying. To the founders’ credit, they are not prepared to abandon their belief in reason and open discourse.
They may not like the association, given their mutual history, but Elon Musk’s attitude to speech on X is similar. It is also worthy of support.
As adults, we can all read and make up our minds about what we believe without needing discourse to be cleansed of views outside a narrow range.
There will be no video this week as Andrew is traveling. Back in full next week. Enjoy. There is a lot to make you think in the works below.
FRANK FUREDI, JAN 13, 2024
Roots & Wings with Frank Furedi
The World Economic Forum Really Thinks That The Biggest Global Risk Is Democracy

The World Economic Forum gathers its troops together at Davos this coming week. It appears that its main concern is the outcome of the numerous elections coming up during the next couple of years. That is why they regard their lack of control over democratic decision making as the biggest threat facing their world.
If you want to understand why the globalist elite is so out of touch with the problems and challenges facing people and society, then you should peruse the World Economic Forum’s Global Risk Report 2024. Written for those in attendance at this year’s WEF conference at Davos, the report outlines what the globalist oligarchs perceive to be the main risks confronting them. The Report is based on a Global Risks Perception Survey, which presumes to communicate the views of the experts and stakeholders who subscribe to the globalist consensus of the WEF.
This Report serves as a paradigm of what I diagnosed elsewhere as democracy panic. The current wave of Democracy Panic is spread by people who believe that the ‘demos’ are influenced by prejudice and fake news.
The tone of the Report is grim. It warns that the ‘eruption of active hostilities in multiple regions is contributing to an unstable global order characterized by polarizing narratives, eroding trust and insecurity’. Throughout the report references are made to the scourge of ‘polarising narratives’. The WEF’s anxiety regarding ‘polarising narratives’ is not surprising since what this term refers to is the emergence of anti-elitist and counter-cultural ideals that challenge the outlook of a complacent ruling elite. Coupled with the obsession with ‘polarising narratives’ is a near hysterical concern with the risk represented by ‘misinformation and disinformation’.
The Global Risk Report 2024 more or less claims that misinformation and disinformation constitute the greatest risk facing society in the period ahead. It notes that:
‘emerging as the most severe global risk anticipated over the next two years, foreign and domestic actors alike will leverage Misinformation and disinformation to further widen societal and political divides’!
The report explicitly connects the alleged risks posed by fake news to its concern with the outcome of the numerous elections that will be held in the next two years. It states that:
As close to three billion people are expected to head to the electoral polls across several economies – including Bangladesh, India, Indonesia, Mexico, Pakistan, the United Kingdom and the United States – over the next two years, the widespread use of misinformation and disinformation, and tools to disseminate it, may undermine the legitimacy of newly elected governments. Resulting unrest could range from violent protests and hate crimes to civil confrontation and terrorism.
As Carolina Klint, Europe chief commercial officer for consultants Marsh McLennan, which helped produce the Report’s findings, stated: ‘The potential impact’ of fake news ‘on elections worldwide over the next two years is significant and that could lead to elected governments’ legitimacy being put in question’.
Since competing claims about what is and what is not true have plagued elections since the beginning of modern times, it is far from clear why they should constitute such a dangerous global threat to humanity.
The report acknowledges that ‘misinformation and disinformation have long histories’ but asserts that ‘the erosion of political checks and balances, and the growing sophistication of tools that spread and control information’ could ‘amplify the efficacy of domestic disinformation over the next two years’. For the authors of the report, the development of new technologies, coinciding with the erosion of trust in the political establishment and its institutions, creates the conditions in which fake news represents an unprecedented threat to global stability.
There is little doubt that new technologies such as AI-generated content can provide new opportunities for confusing and misleading the public with false information. But virtually every new form of communication technology since the invention of the printing press has possessed the potential to promote lies and distort reality. Coping with this threat has been integral to the political and socio-economic challenges facing society throughout modern times. From small self-serving misinformation to the Big Lie, society has always been confronted with the challenge of upholding the truth. That this normal problem of modern society is elevated into an existential threat is driven not so much by the new technologies of misinformation as by concern with the uncertainty posed by democratic decision making.

At the heart of the discussion about Fake News is the question of who gets to decide what is false and what is real. And that is a roundabout way of saying who gets to decide what is true. One of the most important ways that a society comes to a consensus about what is true is through argument, debate and political struggle. Democratic elections are not just about choosing specific policies but also about deciding whose view of the world should prevail. In recent elections the hitherto hegemonic status of the globalist worldview has come under challenge from newly emerging populist and anti-status quo movements, many of which have gone from strength to strength. It is the fear of the outcome of the many impending elections that has motivated the authors of the WEF report to brand misinformation as a global existential threat.
When the report raises the alarm about the possibility of fake news hijacking elections in 2024 and 2025, what it is really saying is that the wrong kind of people and parties could prevail. It is worth noting that the emergence of concern with Fake News coincided with the failure of the forces of globalism to manage the June 2016 Brexit Referendum in their favour. The election of Donald Trump later that year was frequently ascribed to the role played by fake news on social media platforms and other nefarious, technologically created false propaganda.
2016 marked a turning point in the fortunes of the political and cultural elites who subscribe to WEF’s cosmopolitan orientation. Ever since the Brexit Referendum in June 2016, the proceedings at the World Economic Forum have been haunted by the challenge that this event posed for the globalist outlook of those in attendance. From their perspective Brexit symbolised the threat that populism represented to their way of life. Writing in Forbes a month after the referendum, one journalist correctly characterised this event as ‘The Populist Revolt Against “Davos Man”’. Kenneth Rapoza observed that ‘Brexit proved once again that Davos Man isn't all-knowing’. He added that Davos Man ‘has the rhetoric down and he knows how to spread the gospel, but beyond that, their near-term predictions lack vision’.
To this day Brexit is perceived as the launchpad for a global populist revolt. Writing in The Washington Post last year, Ishaan Tharoor claimed that ‘in the U.S. House drama, you can see the long tail of Brexit’. Most supporters of Brexit have no idea how much consternation their triumph caused the global cosmopolitan network consisting of Remainers, EU ideologues and their allies in the World Economic Forum. Their sense of alarm was well captured a month after the referendum by the economic commentator Anatole Kaletsky. He noted that ‘Europe’s fear of contagion is justified, because the Brexit referendum’s outcome has transformed the politics of EU fragmentation’. He added that ‘Brexit has turned “Leave” (whether the EU or the euro) into a realistic option in every European country’.
Back in June 2016, when Kaletsky expressed his sense of alarm, the populist movement that led to Britain leaving the EU could still be dismissed as a one-off event. At successive annual meetings of the Davos clique, participants expressed the hope that the threat posed by their populist opponents had waned. The well-known Indian-American commentator Fareed Zakaria was hopeful that ‘2023 could be the year that exposes populism for the sham that it is’. Numerous anti-populist commentators echoed his sentiment. ‘We seem to have passed peak populism’, predicted Andrew Adonis, a leading British Remainer voice. He described Brexit as ‘an absurd and damaging project based on a host of populist lies’. Adonis’ association of populism with ‘lies’ and dishonesty expresses the principal argument that his side uses to undermine the moral status of its opponents. By drawing a contrast between the fake populist and the truthful Davos Man, people like Adonis imagine that they can undermine the appeal of their political foes.
It has been well over seven years since Kaletsky raised the alarm about the challenge posed by populism to the institutions of the EU. Since that time, movements designated as populist have gained considerable momentum. The election of Giorgia Meloni in Italy in 2022 clearly showed that, despite the accusation that her party relied on populist lies, it could win. Recently, the surprise victory of Geert Wilders and his Freedom Party in the Netherlands showed that populism has become a serious force. In France, Marine Le Pen and her party, the National Rally, are now leading all the opinion polls. A similar pattern of growing support for populist parties is evident in Germany, Austria, Belgium, Sweden and other parts of Northern Europe.
There are some deluded adherents of the WEF’s worldview who actually believe that the principal factor contributing to the success of populist parties is their weaponization of misinformation and lies. However, the inventors of the claim that fake news has become a global existential risk are opportunistic purveyors of this argument. What they are really worried about is their inability to manage democracy. Their elevation of Fake News into a global risk serves as a form of Freudian displacement activity. Unable to face the truth, which is that they lost the argument, they point the finger of blame at the influence of lies and misinformation. In this way their inability to convince the electorate is blamed on the nefarious dishonesty of their opponents.
Concern with Fake News not only represents an attempt to discredit political opponents; it also denigrates the capacity of citizens to make informed choices in elections. From this perspective, those who reject the advice and outlook of the WEF oligarchy are not independent-minded and intelligent voters, but unthinking people drawn towards the purveyors of Fake News. The logical outcome of this perspective is the conviction that citizens are not fit to make difficult political choices and must therefore be protected from the consequences of their actions.
Paradoxically, the Report recognises that the proliferation of misinformation is likely to encourage governments and media outlets to tighten the policing of public discussion. It warns that as it becomes harder to tell the difference between what is real and what is fake, press freedom could be threatened. Its reservations about the threat posed by the policing of public debate notwithstanding, the authors of the Report have supported and highlighted the argument used to justify the creation of new systems of gatekeeping and fact-checking on the web. Its concern about press freedom is a form of hypocrisy: the compliment that vice pays to virtue.
One final point. Perversely, the hysteria promoted about the threat posed by misinformation and Fake News has the effect of disorienting public life. The constant assertion that this or that claim is Fake News leads many people to mistrust mainstream media sources of information. Loss of faith in established media institutions and narratives often runs in parallel with the proliferation of fantasies about conspiracies.
Whatever its genuine objective, the authors’ panic about technology-assisted misinformation actually contributes to the erosion of trust relations within society. But that’s not a problem for the WEF. What they are worried about is the threat to their way of life posed by democracy.
JAN 13, 2024
Last week, I wrote about ~20 different ideas that I've been thinking about touching on this year. Rather than starting to randomly prioritize, I thought I'd ask my readers (you) what you were most interested in. I very quickly realized I'd made a critical mistake. I should have sent out a survey or something, because inviting replies to the email brought hundreds of responses into my inbox / Twitter DMs that I had to manually review one by one. Live and learn! But after tallying the results, the topic you were most interested in was this one:
The Puritans of Venture Capital: Venture used to be a cottage industry. Some firms are still practicing as if nothing really changed. Can they survive? Will capital agglomerators eat their lunch? Or can they co-exist?
It was nearly tied with Books 2.0, which I'll try to touch on next week. In fact, it came down to one vote! Mr. Hunter Walk cast the final vote, pushing Puritans to 133 vs. Books at 132. Tight race!
So let's dig in!

People like humble beginnings. I think it's because they're easier to understand. It's easier to envision Steve Jobs and Steve Wozniak in a garage with a soldering iron and some microchips. It's much harder to comprehend a $2.9 TRILLION company with $30 billion of cash, 2 billion active devices, and 161K employees.
The larger something gets, the harder it is to comprehend. There's no one single person who comprehends every aspect of a business the size and scale of Amazon, or Microsoft. They are living, breathing organisms.
What's interesting about large, complex organizations is that their marketing rarely even attempts to reflect the size and complexity of the thing. There's no incentive for complex organizations to make themselves easier to understand. The only incentive is to make the story attractive. So the marketing sticks to the narrative of humble beginnings, and only goes as far as "we're just getting started."
When it comes to venture capital, the complexity is far more nuanced.
Venture capital is approaching its dotage. Arguably, the first modern venture firms were American Research and Development Corporation (ARDC) and J.H. Whitney & Company, both founded in 1946. So we're coming up on venture capital's 78th birthday. But as we approach the end of venture's first century, it's important to put that in "asset class years," akin to dog years. I've written before about how, relative to asset classes like debt that have been around for thousands of years, 78 just isn't very old.
There are some great books out there that have done a much better job outlining the history of venture capital than I could do. The two I've read recently are VC: An American History by Tom Nicholas and The Power Law by Sebastian Mallaby. I won't re-prosecute the entire evolution of venture. But the one key point is that, in terms of economic scale, venture didn't really start to be relevant until the '80s, and it didn't explode until the dot-com era.
In the grand scheme of private markets AUM globally in 2022, venture capital represented just 22%, alongside other asset classes like buyouts, private debt, and real estate.

Venture funding didn't even cross $10B a quarter until ~1999. Total venture capital deployed in the US hit $100B for the first time in 2000 during the first internet bubble.

Next, global venture funding peaked in 2021 at over $600B.

Since then, funding amounts have come back to earth with ~$200B invested in 2023 compared to the $600B+ in 2021.

As a business, venture capital is deploying ~$200-300B a year globally. While the ramp has been volatile, with spikes in 2000 and 2021, the trend will likely continue to grow. So the broader question is: HOW are those hundreds of billions being deployed?
People often talk about how venture capital is a "cottage industry." The phrase goes back to when small manufacturing operations were run out of someone's home, or cottage. So when people say venture was a cottage industry, it's almost like calling it a "Mom & Pop shop."
As an industry, I don't think that's quite right. One of the most famous early venture investments was when Arthur Rock backed the Traitorous Eight in building Fairchild Semiconductor. The money came from Sherman Fairchild, a millionaire businessman with defense contracts from Raytheon, etc. It was more a function of big companies funding new ideas than a side-hustle run from home.

Mario Gabriele explains how much value was produced from that group:
"As of 2014, an estimated 92 companies trace their roots back to Fairchild Semiconductor founders and employees (some suggest the number is closer to 400), with $2 trillion in value created. That includes Apple, Advanced Micro Devices, and Applied Materials, as well as venture firms like KPCB and Sequoia."
So right off the bat, we're dealing with big dollars, big egos, and big outcomes. Not much of a cottage-based yarn-spinning operation, right? So what do people mean when they describe venture capital as a "cottage industry"? They're typically talking about venture capital partnerships: the mechanisms through which capital deployment decisions were made, and the organizations behind those decisions.
Historically, firms like Benchmark or Accel had teams of ~6-9 partners. For example, Accel in July 2006 had 9 partners in the US.
…More
JAN 10, 2024
VC has long been a cottage industry that has seen little innovation. This is particularly surprising as VCs themselves are the ones backing the most disruptive businesses. They have a front-row seat when it comes to the adoption of new technologies and business model innovation, yet in the first 60 years following the industry’s inception in the 1950s, the only change was the shift from pen & paper to computer & MS Office.
The reason for this lack of innovation is most likely the absence of competition and pressure to change. Access to capital for startups with less traditional business models and a lack of collateral has historically been heavily constrained. This is why the VC industry evolved in the first place and, unfortunately, this reality is still true for the majority of new startups today.
As a result of a supply-side constrained market, VCs could long afford to be picky and weren't forced to innovate. Until the early 2010s. Maturing ecosystems and cheap-money policies increased new firm formation as well as the assets under management per firm. Access to capital gradually became more available for startups, and the shift from a supply-side constrained market to a more balanced one (in 2020 and 2021 even a partially demand-side constrained one) suddenly forced investors to get their act together.
Ever since I joined the VC industry in 2017, I’ve been observing, pushing, and writing about growing innovation in this rusty industry. In today’s episode, I’d like to zoom out again and look at the major trends and predictions for 2024 and beyond. Let’s dive in!

The VC industry faces a range of challenges. Overpriced portfolio companies, lack of exit channels, DPI and performance issues, fundraising struggles, generational transition, diversity, you name it.
In the past decade, the VC market has seen only one direction: up and to the right. Since 2022, however, this has suddenly changed, and I expect a period of natural selection in the next year or two.

Looking at the drivers of this prediction, I’d like to double-click on the most dominant market-related components.
Venture returns are power-law distributed. Few outsized winners deliver the majority of returns. For this logic to work, VCs need exit channels like IPOs and M&A with significant liquidity.
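To make the power-law claim concrete, here is a minimal simulation (illustrative parameters only, not figures from the piece): draw company return multiples from a heavy-tailed Pareto distribution and measure how much of the total the top handful of winners delivers.

```python
import numpy as np

# Illustrative only: 100 portfolio-company outcomes with heavy-tailed returns.
rng = np.random.default_rng(42)
multiples = rng.pareto(a=1.2, size=100) + 1  # return multiple on invested capital

multiples.sort()
top5_share = multiples[-5:].sum() / multiples.sum()
print(f"Top 5 of 100 companies deliver {top5_share:.0%} of total returns")
```

Run it with different seeds and the top handful reliably accounts for the bulk of the pot, which is exactly why blocked exit channels hurt so much: the few winners are where nearly all the DPI lives.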
Deflating public markets at the end of 2022 and the resulting liquidity crunch were anything but helpful. Since then, many VCs have sat on piles of paper money but cannot divest and deliver DPI.
This translates directly to LPs, which in turn have limited resources for new engagements and re-ups. Consequently, their deployment strategy for 2023, and at least until the re-opening of IPO windows towards the end of 2024 or even early 2025, is extremely selective.
“The $67bn raised by US VCs in 2023 is the lowest annual total since 2017 and represents a 60 per cent drop from the $173bn raised in 2022, the peak year for fundraising, according to analysis by private markets data provider PitchBook and the National Venture Capital Association. Globally, in 2023 venture investors raised the lowest level of capital since 2015” (source FT Jan 5 2024)
Based on feedback from various institutional LPs, most have cut back on new engagements with emerging managers and become hyper-focused on performance KPIs for 3rd-generation+ GPs.
Second fund generations are a bit of a different breed, as they tend to follow the inaugural fund about 3 to 4 years later and are unlikely to deliver tangible performance that soon. Thus, whenever LPs invest in a first fund generation, they typically subscribe (at least in their minds) to the second generation too.
“LPs are aware that when the second fund comes along, they won’t yet know how well the first fund has performed,” says Jeremy Uzan. “In a way, they already knew that they would back fund one and fund two” (source Singular Fund II Announcement, Dec 14 2023)
I expect several GPs with insufficient exit track records and/or other challenges to disappear in the next year or two.
This will most likely hit firms that were founded in the rising market of 2010-2018ish, as they're old enough for LPs to require KPIs (raising 3rd generation+) but too young to have distributed tangible money to their LPs. As a result, they will either cut headcount to extend runway into hopefully friendlier market environments, close shop, or join forces with other firms.

Though VC as an industry has historically seen very little M&A, recent activity (driven by different motivations, some even from a mutual position of strength; examples above) might provoke a broader appetite among established VC firms to acquire strategic assets, like a brand, a portfolio, or an investment team, from struggling GPs in order to enter new markets.
Our partnership at Earlybird has seen three major downturns since our firm’s inception in 1997: Dotcom, GFC, and COVID. Based on first-hand experience and internal analyses, we find that the private venture capital cycle can be split into 4 major phases.
…More
JAN 8, 2024
The truism “You are what you eat” has never been as apt as in machine learning models and products. The impressive proof-of-concepts built by experts and hobbyists alike, along with a growing number of production use cases, have only driven the demand for data.
Good data, web data, free data.
Data that’s structured and data that’s unstructured (especially unstructured).
Junk data, organic & free range ethical data.
Is there really a difference when it comes to delivering incredible experiences and new features with the God-tier models on the market? Isn’t all data “good data” for the purposes of machine learning at this point?
Let’s take a stroll back through some of machine learning’s greatest examples of data-related flops for a quick refresher on how data quality can have, and has had, detrimental (and measurable) individual and societal impacts, at least until those projects were pulled.
Most companies aren’t quick to publicize their failings, as retrospectives and post-mortems are used more as internal exercises to prevent future hits to the company’s current share price rather than for external-facing accountability. That’s not to say there aren’t examples, especially in sensitive areas like medicine, education, and civil rights.
For example, iTutorGroup, a tutoring company thought to be using ML-powered recruiting software, recently settled a lawsuit alleging that its screening process discriminated by age, screening out “women aged 55 or older and men who were 60 or older.” Was a filter intentionally set, or could this have been an incident where the training data was biased towards younger candidates, with past hiring decisions potentially enshrining the prejudices of the hiring managers?
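That second possibility is easy to reproduce. Below is a hypothetical sketch (synthetic data and made-up features, not iTutorGroup’s actual system): a classifier trained on past screening decisions that favored younger applicants learns to reject older candidates even though nobody wrote an explicit age filter.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic history: past screening decisions driven largely by age.
rng = np.random.default_rng(0)
ages = rng.integers(22, 70, size=2000)
years_exp = np.clip(ages - 22 - rng.integers(0, 10, size=2000), 0, None)
passed_screen = (ages < 55).astype(int)  # the enshrined prejudice

X = np.column_stack([ages, years_exp])
model = LogisticRegression().fit(X, passed_screen)

# The trained model now screens out a highly experienced 60-year-old.
print(model.predict([[60, 30]]))  # [0] -> rejected
print(model.predict([[35, 10]]))  # [1] -> passed
```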
This wouldn’t be the first time that poor design and data collection, as well as flawed model training and experiment design, have failed an ML product, especially in ways that were detrimental to the individuals unfairly caught up in the dragnets of AI. For example, the growing number of cases of facial recognition being used to wrongfully arrest people of color has started to shed light on how poorly designed and evaluated these models (and the datasets used to train them) are, and on the ubiquitous lack of best practices.
Even if the data is “good” from the perspective of reflecting the biases and systemic injustices present in real-world conditions, do we really want a mirror image? For example, do we want to automate into existence a world where inequitable access to medical care is being used to exclude Black patients from being identified as in need of “high-risk care management” programs by hospitals and insurance companies? (Additional links: Loyola Health Law Review, Racial Bias Found in a Major Health Care Risk Algorithm)
Poor data handling doesn’t just hurt “we the people,” it’s also costly.
As an analyst creating financial forecasting models, what if you were using outdated or bad financial data? Aside from inaccurate predictions and forecasts, resulting in poor investment decisions and significant financial losses for the model’s end users, you could end up losing your company customers (and yourself a job).
Are you a clinical director test-driving a new diagnostic tool meant to replace your favorite and expensive radiologist? You’ve just found out that the underlying model was initially trained with mislabelled data and certain instances were tagged with the wrong diagnosis, such as Epic’s Sepsis Model that had to be revamped. Have fun with those malpractice lawsuits.
The irony is that even as we struggle with data quality on easy mode, for analytic or simple machine learning use cases, data quality is the next “big thing” in a future where generative AI, multimodal ML systems, and streaming will become the norm. However, it could be argued that data quality has been the real hero the entire time. After all, the success behind GPT-4 (and even GPT-3) had as much (if not more) to do with their investments in their data as their algorithmic research.
For example, Karpathy’s presentation on the GPT assistant training pipeline highlighted the huge corpus of raw (ugly) data initially used, then an additional corpus of data created by contractors, then additional rounds of manual evaluation and prompt response, during which multiple iterations of models were trained (parts of this process constitute a technique called RLHF, Reinforcement Learning From Human Feedback).
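As a rough illustration of the reward-modeling ingredient in that pipeline, here is a toy sketch (invented features, with best-of-n selection standing in for the reinforcement-learning step; real RLHF trains a neural reward model and then optimizes the policy against it):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy reward modeling: humans compare response pairs; we fit a scorer
# so the preferred response of each pair gets the higher reward.
# Each response is reduced to crude features: [length, politeness, accuracy].
preferred = np.array([[120, 0.9, 0.95], [80, 0.8, 0.90], [150, 0.7, 0.99]])
rejected = np.array([[40, 0.2, 0.60], [200, 0.1, 0.50], [30, 0.3, 0.40]])

# Bradley-Terry trick: P(A beats B) = sigmoid(w . (feat_A - feat_B)),
# i.e., logistic regression on feature differences.
diffs = np.vstack([preferred - rejected, rejected - preferred])
labels = np.array([1, 1, 1, 0, 0, 0])
reward_model = LogisticRegression(fit_intercept=False).fit(diffs, labels)

def reward(features):
    return reward_model.decision_function([features])[0]

# "Alignment" reduced to best-of-n: score candidates, keep the best.
candidates = [[100, 0.85, 0.90], [60, 0.30, 0.50], [140, 0.75, 0.97]]
print("chosen response features:", max(candidates, key=reward))
```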
In fact, if you consider all the recent winners and leaders of the generative AI wave, from big cloud enterprise companies to lean and mean cutting-edge startups, it’s clear that the secret sauce was in how they architected their data engines.
The implication is that for companies looking to build AI or use AI as part of their services and core offerings, whether that be developing the next large model (maybe even the next large vision model or multimodal model) or developing an autonomous vehicle service, the most important area of opportunity is developing the processes and infrastructure for systematically engineering data quality into their pipelines. Building in continuous improvement and iteration, whether the data is structured or unstructured, is a key prerequisite to ensuring a company isn't left behind and doesn't become a relic in a data-centric future.
Machine learning and generative AI aren't going to kill your business. Dall-E and Midjourney could have been the death knell for Adobe; instead, more than 1 million images have been generated using Firefly.
One of the worst mistakes companies and data developers could make is thinking it’s already too late to leverage their data to create powerful products and services. Anyone who’s been involved in data governance efforts at an enterprise company with a legacy stack and a sprawl of data sources would be hard-pressed to see how they could close the gap between themselves and their more svelte competitors.
Yet astute data developers know they shouldn't fear the reaper: the emergence of Gen-AI is pushing companies to acknowledge the poor state of their data, especially as a shortage of data for training LLMs is on the horizon. Running out of training data, as foundation models keep growing and consume publicly available, high-quality data far faster than it is replenished, seems like both a weird problem and an opportunity.
How? For starters, most companies and organizations can't afford the years and millions of dollars of funding to create the equivalent of Google Brain or DeepMind, exploring new model architectures that may offer only incremental improvements over all the LLMs and foundation models available on the market (both proprietary and open-source). But data is something all companies have in abundance: data that is specific, private, and custom to the company and its use cases. And even if you don't have a readily available dataset, you can always buy data the way OpenAI, Google, and Meta (amongst many others) have.
Companies that have invested in enabling a data flywheel at the product level know that they’ll be okay, especially because the next challenge with the democratization of LLMs will demand differentiation at the data level, especially of higher quality data for fine-tuning models (rather than swamp loads of poor quality data).
You’ve bought into the possibility that data quality can be impactful, not just for the current and most prevalent analytic and machine learning use cases. You’ve also acknowledged that there seems to be a relationship between your company’s current level of data quality and the myriad opportunities to capitalize on the recent advancements of generative AI. And you’re eager to leverage the data you already have (especially since it means dodging the copyright issues other companies seem to be running into) and ready to make the necessary investments in data quality.
The question then becomes: What drivers of data quality in the machine learning lifecycle can you influence?
In order to answer this critical question, we’ll need to define data quality (especially from the machine learning perspective) and the specific problems that data-centric AI addresses (and how).
It’s worth starting with the different touch points between data and code in typical software systems and projects. Quite simply, the data is secondary. The application logic is primary.
What do I mean? I mean that the actual logic of the application or microservice doesn’t depend on the data, its structure, its schema, or the distribution of its values. If you want a website that loads fast and lists all items being added in a specific order or based on filters, you don’t really need to know too much about the items being listed other than the associated metadata or attributes that need to be called or returned. You’re not necessarily changing every site page dynamically depending on the items themselves, and even if you are, it’s still based on some set attribute like category (“Baby Supplies” versus “Arts and Crafts,” or “Workout Equipment” versus “Wood Working”). You could take Target’s entire SKU catalog and swap it with Amazon or Walmart, and as long as the data specs match up, you’re good. (Sans special products, prices, and rewards. IYKYK.)
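A tiny sketch of that separation (hypothetical catalog and attribute names, not any retailer's real schema): the listing logic below depends only on the agreed attributes, so swapping in a different catalog with the same spec leaves the code untouched.

```python
# The listing logic cares only about agreed attributes, not the items themselves.
def list_items(catalog, category, max_price):
    matches = [item for item in catalog
               if item["category"] == category and item["price"] <= max_price]
    return sorted(matches, key=lambda item: item["price"])

catalog_a = [
    {"name": "Crib Sheet", "category": "Baby Supplies", "price": 12.99},
    {"name": "Glue Stick", "category": "Arts and Crafts", "price": 1.49},
]
catalog_b = [  # a different retailer's SKUs, same data spec
    {"name": "Baby Monitor", "category": "Baby Supplies", "price": 39.00},
]
print(list_items(catalog_a, "Baby Supplies", 50))
print(list_items(catalog_b, "Baby Supplies", 50))  # code unchanged
```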
Effective data management, particularly in the formulation of a well-suited training dataset, holds significance for enhancing model performance & improving training efficiency during pretraining & supervised fine-tuning phases. – [2312.01700] Data Management For Large Language Models: A Survey
A machine learning model is its data: all the sets of data required to train (or fine-tune) it. The three types of data artifacts created throughout the machine learning lifecycle where quality is key are the:
Training Data
Inference Data
Maintenance Data (including metadata).


How does quality play a role in each of these datasets?
Manual exploratory data analysis is an important process (and rite of passage for new data developers) because oftentimes, that’s where data quality issues are initially identified. Early exploration can often yield valuable information on the relationships of different entities and processes within the business to each other. Scatter plots, heat maps, bar charts, correlation analysis, etc., can reveal opportunities to apply transformation to the data in the form of feature engineering or the creation of inputs to downstream machine learning models. Well-defined, normalized, and consistent data (as well as documentation) can speed up this process significantly, cutting down on back & forths.
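A minimal version of that first pass might look like this (pandas, with stand-in columns and values):

```python
import pandas as pd

# Stand-in dataset; in practice this would be a real extract.
df = pd.DataFrame({
    "age": [34, 45, 29, None, 52],
    "income": [72000, 98000, 51000, 60000, 120000],
    "churned": [0, 0, 1, 1, 0],
})

print(df.describe())               # distributions: impossible values, skew
print(df.isna().sum())             # missingness: where imputation rules are needed
print(df.corr(numeric_only=True))  # correlations: candidate features, possible leakage
```

Plots and heat maps come next, but even these three calls surface many of the quality issues that would otherwise leak into feature engineering.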
We support journalism, partner with news organizations, and believe The New York Times lawsuit is without merit.

January 8, 2024
Our goal is to develop AI tools that empower people to solve problems that are otherwise out of reach. People worldwide are already using our technology to improve their daily lives. Millions of developers and more than 92% of the Fortune 500 are building on our products today.
While we disagree with the claims in The New York Times lawsuit, we view it as an opportunity to clarify our business, our intent, and how we build our technology. Our position can be summed up in these four points, which we flesh out below:
We collaborate with news organizations and are creating new opportunities
Training is fair use, but we provide an opt-out because it’s the right thing to do
“Regurgitation” is a rare bug that we are working to drive to zero
The New York Times is not telling the full story
We work hard in our technology design process to support news organizations. We’ve met with dozens, as well as leading industry organizations like the News/Media Alliance, to explore opportunities, discuss their concerns, and provide solutions. We aim to learn, educate, listen to feedback, and adapt.
Our goals are to support a healthy news ecosystem, be a good partner, and create mutually beneficial opportunities. With this in mind, we have pursued partnerships with news organizations to achieve these objectives:
Deploy our products to benefit and support reporters and editors, by assisting with time-consuming tasks like analyzing voluminous public records and translating stories.
Teach our AI models about the world by training on additional historical, non-publicly available content.
Display real-time content with attribution in ChatGPT, providing new ways for news publishers to connect with readers.
Our early partnerships with the Associated Press, Axel Springer, American Journalism Project and NYU offer a glimpse into our approach.
Training AI models using publicly available internet materials is fair use, as supported by long-standing and widely accepted precedents. We view this principle as fair to creators, necessary for innovators, and critical for US competitiveness.
The principle that training AI models is permitted as a fair use is supported by a wide range of academics, library associations, civil society groups, startups, leading US companies, creators, authors, and others that recently submitted comments to the US Copyright Office. Other regions and countries, including the…
That being said, legal right is less important to us than being good citizens. We have led the AI industry in providing a simple opt-out process for publishers (which The New York Times adopted in August 2023) to prevent our tools from accessing their sites.
Our models were designed and trained to learn concepts in order to apply them to new problems.
Memorization is a rare failure of the learning process that we are continually making progress on, but it’s more common when particular content appears more than once in training data, like if pieces of it appear on lots of different public websites. So we have measures in place to limit inadvertent memorization and prevent regurgitation in model outputs. We also expect our users to act responsibly; intentionally manipulating our models to regurgitate is not an appropriate use of our technology and is against our terms of use.
Just as humans obtain a broad education to learn how to solve new problems, we want our AI models to observe the range of the world’s information, including from every language, culture, and industry. Because models learn from the enormous aggregate of human knowledge, any one sector—including news—is a tiny slice of overall training data, and any single data source—including The New York Times—is not significant for the model’s intended learning.
Our discussions with The New York Times had appeared to be progressing constructively through our last communication on December 19. The negotiations focused on a high-value partnership around real-time display with attribution in ChatGPT, in which The New York Times would gain a new way to connect with their existing and new readers, and our users would gain access to their reporting. We had explained to The New York Times that, like any single source, their content didn't meaningfully contribute to the training of our existing models and also wouldn't be sufficiently impactful for future training. Their lawsuit on December 27—which we learned about by reading The New York Times—came as a surprise and disappointment to us.
Along the way, they had mentioned seeing some regurgitation of their content but repeatedly refused to share any examples, despite our commitment to investigate and fix any issues. We’ve demonstrated how seriously we treat this as a priority, such as in July when we took down a ChatGPT feature immediately after we learned it could reproduce real-time content in unintended ways.
Interestingly, the regurgitations The New York Times induced appear to be from years-old articles that have proliferated on multiple third-party websites. It seems they intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate. Even when using such prompts, our models don’t typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts.
Despite their claims, this misuse is not typical or allowed user activity, and is not a substitute for The New York Times. Regardless, we are continually making our systems more resistant to adversarial attacks to regurgitate training data, and have already made much progress in our recent models.
We regard The New York Times’ lawsuit as without merit. Still, we are hopeful for a constructive partnership with The New York Times and respect its long history, which includes reporting the first working neural network over 60 years ago and championing First Amendment freedoms.
We look forward to continued collaboration with news organizations, helping elevate their ability to produce quality journalism by realizing the transformative potential of AI.
January 11, 2024 by Jeff Jarvis
Well, that was surreal. I testified in a hearing about AI and the future of journalism held by the Senate Judiciary Subcommittee on Privacy, Technology, and the Law. Here is my written testimony and here’s the Reader’s Digest version in my opening remarks:
It was a privilege and honor to be invited to air my views on technology and the news. I went in knowing I had a role to play, as the odd man out. The other witnesses were lobbyists for the newspaper/magazine and broadcast industries and the CEO of a major magazine company. The staff knew I would present an alternative perspective. My fellow panelists noted before we sat down — nicely — that they disagreed with my written testimony. Job done. There was little opportunity to disagree in the hearing, for one speaks only when spoken to.
What struck me about the experience is not surprising: They call the internet an echo chamber. But, of course, there’s no greater echo chamber than Congress: lobbyists and legislators agreeing with each other about the laws they write and promote together. That’s what I witnessed in the hearing in a few key areas:
Licensing: The industry people and the politicians all took as gospel the idea that AI companies should have to license and pay for every bit of media content they use.
I disagree. I draw the analogy to what happened when radio started. Newspapers tried everything to keep radio out of news. In the end, to this day, radio rips and reads newspapers, taking in and repurposing information. That’s to the benefit of an informed society.
Why shouldn’t AI have the same right? I ask. Some have objected to my metaphor: Yes, I know, AI is a program and the machine doesn’t read or learn or have rights any more than a broadcast tower can listen and speak and vote. I spoke metaphorically, for if I had instead argued that, say, Google or Meta has a right to read and learn, that would have opened up a whole can of PR worms. The point is obvious, though: If AI creators would be required by law to license *everything* they use, that grants them lesser rights than media — including journalists, who, let’s be clear, read, learn from, and repurpose information from each other and from sources every day.
I think there’s a difference in using content to train a model versus producing output. It’s one matter for large language models to be taught the relationship of, say, the words “White” and “House.” I say that is fair and transformative use. But it’s a fair discussion to separate out questions of proper acquisition and terms of use when an application quotes from copyrighted material from behind a paywall in its output. The magazine executive cleverly conflated training and output, saying *any* use required licensing and payment. I believe that sets a dangerous precedent for news media itself.
If licensing and payment is required for all use of all content, then I say the doctrine of fair use could be eviscerated. The senators argued just the opposite, saying that if fair use is expanded, copyright becomes meaningless. We disagree.
JCPA: The so-called Journalism Competition and Preservation Act is a darling of many members of the committee. Like Canada’s disastrous Bill C-18 and Australia’s corrupt News Media Bargaining Code — which the senators and the lobbyists think are wonderful — the JCPA would allow large news organizations (those that earn more than $100,000 a year, leaving out countless small, local enterprises) to sidestep antitrust and gang together and force platforms to “negotiate” for the right to link to their content. It’s legislated blackmail. I didn’t have the chance to say that. Instead, the lobbyists and legislators all agreed how much they love the bill and can’t wait to try again to pass it.
Section 230: Members of the committee also want to pass legislation to exclude generative AI from the protections of Section 230, which enables public discourse online by protecting platforms from liability for what users say there while also allowing companies to moderate what is said. The chair said no witness in this series of hearings on AI has disagreed. I had the opportunity to say that he has found his first disagreement.
I always worry about attempts to slice away Section 230’s protections like a deli bologna. But more to the point, I tried to explain that there is nuance in deciding where liability should lie. In the beginning of print, printers were held liable — burned, beheaded, and behanded — for what came off their presses; then booksellers were responsible for what they sold; until ultimately authors were held responsible — which, some say, was the birth of the idea of authorship.
When I attended a World Economic Forum AI governance summit, there was much discussion about these questions in relation to AI. Holding the models liable for everything that could be done with them would, in my view, be like blaming the printing press for what is put on and what comes off it. At the event, some said responsibility should lie at the application level. That could be true if, for example, Michael Cohen was misled by Google when it placed Bard next to search, letting him believe it would act like search and giving him bogus case citations instead. I would say that responsibility generally lies with the user, the person who instructs the program to say something bad or who uses the program’s output without checking it, as Cohen did. There is nuance.
Deep fakery: There was also some discussion of the machine being used to fool people and whether, in the example used, Meta should be held responsible and expected to verify and take down a fake video of someone made with AI — or else be sued. As ever, I caution against legislating official truth.
The most amusing moment in the hearing was when the senator from Tennessee complained that media are liberal and AI is liberal and for proof she said that if one asks ChatGPT to write a poem praising Donald Trump, it will refuse. But it would write a poem praising Joe Biden and she proceeded to read it to me. I said it was bad poetry. (BTW, she’s right: both ChatGPT and Bard won’t sing the praises of Trump but will say nice things about Biden. I’ll leave the discussion about so-called guardrails to another day.)
It was a fascinating experience. I was honored to be included.
For the sake of contrast, in the morning before the hearing, I called Sven Størmer Thaulow, chief data and technology officer for Schibsted, the much-admired (and properly so) news and media company of Scandinavia. Last summer, Thaulow called for Norwegian media companies to contribute their content freely to make a Norwegian-language large language model. “The response,” the company said, “was overwhelmingly positive.” I wanted to hear more.
Thaulow explained that they are examining the opportunities for a native-language LLM in two phases: first research, then commercialization. In the research phase now, working with universities, they want to see whether a native model beats an English-language adaptation, and in their benchmark tests, it does. As a media company, Schibsted has also experimented with using generative AI to allow readers to query its database of gadget reviews in conversation, rather than just searching — something I wish US news organizations would do: Instead of complaining about the technology, use it to explore new opportunities.
Media companies contribute their content to the research. A national organization is making a blanket deal and individual companies are free to opt out. Norway being Norway — sane and smart — 90 percent of its books are already digitized and the project may test whether adding them will improve the model’s performance. If it does, they and government will deal with compensation then.
All of this is before the commercial phase. When that comes, they will have to grapple with fair shares of value.
How much more sensible this approach is than what we see in the US, where technology companies and media companies face off, with Capitol Hill as their field of play, each side trying to play the refs there. The AI companies, to my mind, rushed their services to market without sufficient research about impact and harm, misleading users (like hapless Michael Cohen) about their capabilities. Media companies rushed their lobbyists to Congress to cash in the political capital earned through journalism, seeking protectionism and favors from the politicians their journalists are supposed to cover independently. Politicians use legislation to curry favor in turn with powerful and rich industries.
Why can’t we be more like Norway?
JAN 5, 2024

There are two ways to think about ChatGPT.
One way is that it’s exceptionally good at doing stuff. We give it an instruction, and, because it’s been trained on an enormous corpus of human language, it can respond in a way that would, if anything, fail the Turing test for being too capable. In an instant, it can write a French sonnet about New Year’s Eve; it can create an argument for why it should be a felony to write songs in C major; it can understand a 700-word blog post from which all the vowels have been removed. For tasks like these—writing emails, creating lesson plans, finding and booking a restaurant for a six-person get-together next Friday in New Orleans—ChatGPT is valuable because of what it can do.
The second way to think about it is that it knows things. We ask an LLM like ChatGPT a question; it tells us the answer. It’s valuable because it’s read every encyclopedia and textbook and Reddit post in the world, and can summarize—and in some cases, recreate—what those things say. Though LLMs don’t store this information in a traditional sense—there is no file in GPT-4 that contains the full text of the Declaration of Independence, for example—ChatGPT can still rewrite the entire document. In this way, LLMs aren’t useful because of what they can do, but because of what they know—like who Calvin’s babysitter was, who scored the most points in a WNBA game, and which song starts with the notes “da da da dum.”
This second version of ChatGPT—the one that, above all, knows things—is the version that caused people to declare Google dead, and caused Google to freak out, when OpenAI released it. Whereas Google can find links that might answer your questions, ChatGPT answers them directly. Its appeal was as the ultimate lazyweb.1
If this is the role that LLMs come to occupy—Google 2.0, basically—copyrighted content from books and news publishers is immensely valuable to OpenAI. To replace Google, ChatGPT would need to “know” most of what Google can find—and Google can search the entire internet, including copyrighted websites. Without access to that content, ChatGPT isn’t a better Google; it’s a chatbot for summarizing Wikipedia and Reddit.
If ChatGPT ultimately occupies the first role—a bot that does stuff; an agent—OpenAI doesn’t need copyrighted material. An AI agent would be useful for the same reasons that a human agent is useful, and human agents are useful because they can complete complex tasks based on ambiguous instructions. They don’t need to know that much; they need to be able to communicate, reason about problems,2 and look stuff up. And just as a human assistant can be a good assistant without memorizing the script of Star Wars or what was said in the Wall Street Journal yesterday, an LLM can probably be trained to be a useful agent without being trained on copyrighted content. Give it enough high-quality text, from any source, and it can learn to talk as well as any of us.3
Despite the initial panic at Google, I’d be surprised if ChatGPT comes for search. Though that’s partially because LLMs aren’t, on their own, reliable narrators of fact, it’s much more because the economic value of agents that do stuff is potentially far greater than the economic value of a chatbot that knows stuff. “We can help your accountants answer common questions about tax regulations” is a nice pitch, but a fundamentally incremental improvement over Google; “we can create an infinite army of cheap digital labor that can do a lot of the tasks your employees do” is transformative. The frontier of ChatGPT potential isn’t replacing Google, but in using Google4—and in the same way that the cost of manual labor made industrialization all but inevitable, the cost of skilled labor probably makes the agentization of fake email jobs5 all but inevitable too.6
In other words, for the enhanced search engine that OpenAI is today, copyrighted content is necessary. Omniscient oracles need to read the news to be omniscient. But for the autonomous agents they’ll likely become, copyrighted material is simply convenient—news websites, for example, are generally reliable, accurate, well-written, constantly produced in large quantities, and can be collected from relatively centralized sources. But any sufficiently diverse body of text will do.

By Alexandra Lindsay and Greg Dale
Jan. 5, 2024 9:03 AM PST
Want to feel old? It was more than five years ago that director Jordan Peele teamed up with BuzzFeed to create a viral deepfake video of Barack Obama uttering a series of improbable lines, a clip meant to serve as a public service announcement about the dangers of how technology could be used to manipulate public opinion. “It may sound basic, but how we move forward in the age of information is going to be the difference between whether we survive or become some kind of fucked-up dystopia,” Peele, as Obama’s ventriloquist, said.
Now, on the precipice of the 2024 election season, that dystopia is just around the corner, courtesy of artificial intelligence—or so say some of the doomsayers. Last week, Fortune magazine quoted Oren Etzioni, an AI expert and professor emeritus at the University of Washington, who imagines a coming flood of AI-fabricated content showing President Joe Biden being rushed to the hospital or depicting a run on banks. “I am completely terrified,” Etzioni said.
And in June, former Google chair Eric Schmidt told CNBC that AI-generated misinformation in this year’s election was one of the biggest short-term dangers from the technology. “The 2024 elections are going to be a mess because social media is not protecting us from false generated AI,” Schmidt said.
We’re not so sure about that. As technologists with years of experience in the political trenches, we have collectively worked on digital strategy for hundreds of campaigns at the federal, state and local level, in addition to national voter mobilization campaigns.
From our perspective, the impact of AI on this election is likely to be more nuanced than many people predict (including the American public: A recent poll showed that 58% of American adults are concerned about the use of AI increasing the spread of false information during the 2024 presidential election).
Don’t get us wrong—there are real reasons to worry about how rapid advances in AI technology will impact elections. We’ve just emerged from several election cycles in which everyone from Russian agents to presidential candidates themselves spread disinformation widely via social networks. In this new era, foreign adversaries, campaign managers and meme lords will increasingly explore and exploit generative AI’s newfound abilities to make an impact on the political scene. But the AI election apocalypse isn’t likely to happen this year.
Here are our predictions for how AI will (and will not) impact the 2024 U.S. election cycle:
Alexandra Lindsay is co-author of AI Political Pulse, a newsletter dedicated to the politics and policy of artificial intelligence. She is the board chair at Close the Gap California, a nonprofit that recruits women to run for the California State Legislature. She formerly served as Head of Product and Operations at Tech for Campaigns, a nonprofit focused on bringing advanced digital marketing and data science to politics.
Greg Dale is co-author of AI Political Pulse, a newsletter dedicated to the politics and policy of artificial intelligence, and is a marketing and product consultant. He formerly served as CEO of Tech for Campaigns, working with hundreds of Democratic campaigns and independent expenditures at all levels of the ballot.
JAN 7, 2024
A few disclaimers before we get into this boondoggle:
I’m not a reporter - I’m a humble tech CFO who thinks this (or whatever this is) is an important issue for the venture-backed ecosystem
Also, I’m holding an infant in my left hand, typing with my right, and haven’t shaved since Thursday - not exactly TechCrunch or Axios material
I’m currently a (happy? - tbd) Carta customer
I believe the financial infrastructure they’ve built for private markets revolutionized transparency, trust, and liquidity for startups
I interviewed their CFO Charly Kevers on my podcast Run the Numbers last year, and I think he’s one of the best in the game
OK, let’s ride!
On Friday, Linear founder and Carta customer Karri Saarinen dropped a bombshell.

The one sentence summary would be:
Carta offers two popular products - cap table management and a platform for secondary transactions - and the founder of a company who uses Carta for cap table management thinks Carta is using confidential shareholder info to facilitate trades of his company’s stock, without his approval, on their trading platform.
But first - what are secondary transactions? And when do they occur?
Privately held venture-backed companies often grant their employees stock options as a form of compensation. It also helps attract talent at a lower salary - the company gives you less cash today in exchange for unlimited upside tomorrow.
However, these stock options are typically not liquid until the company goes public, which can take several years. As a result, many pre-IPO companies allow their employees to participate in tender offers, commonly called secondary transactions, which allow them to sell a portion of their vested shares to outside investors.
A tender offer is a liquidity event in which a company, investor, or group of investors proposes to buy a fixed number of shares from existing shareholders at a set price. Tender offers can be made for both private and public companies; OpenAI and Stripe are recent examples.
It’s important to note that the company does not get any money from a secondary transaction. The balance sheet does not change. While primary dollars are used to fund future operations, M&A, and hiring, no new shares are created in a secondary transaction; existing shares merely change hands.
Tender offers are also a total pain in the ass to administer. The company merely acts as an approver.
Tender offers typically occur in conjunction with a later stage fundraise (Series C and beyond). This is the sweet spot where founders have been at it for long enough to take a little off the table.
Anytime before then is usually a pretty big red flag to investors - it would be suspect if a Series A founder wanted to line their pockets before the company has achieved product market fit.
That’s why all tender offers require board approval.
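To make the mechanics concrete, here is a minimal sketch of the arithmetic for a single seller in a tender offer. The numbers are hypothetical, chosen only to show that the seller gets paid while the company issues no shares and receives no cash:

```python
# Hypothetical tender offer: all figures are illustrative, not from any real deal.
vested_shares = 10_000        # employee's vested shares
tender_price = 20.00          # per-share price set by the buyer(s)
participation_cap = 0.15      # e.g. holders may sell up to 15% of vested shares

shares_sold = int(vested_shares * participation_cap)  # 1,500 shares change hands
seller_proceeds = shares_sold * tender_price          # $30,000 to the seller

# The defining property of a secondary: no new shares are issued and the
# company receives none of the money; ownership simply transfers to the buyer.
company_proceeds = 0.0
new_shares_issued = 0

print(f"Seller receives ${seller_proceeds:,.0f} for {shares_sold:,} shares; "
      f"company receives ${company_proceeds:,.0f} and issues {new_shares_issued} shares.")
```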
Here’s the crux of it all - is Carta using its private, confidential, asymmetric information to shake loose transactions it would benefit from, without founder / CEO / CFO / board approval?
Whew!…
Connie Loizos @cookie / 7:30 PM PST•January 8, 2024

Image Credits: Carta
Roughly 72 hours after a prominent startup customer complained that Carta was misusing information with which it was entrusted — scaring many of Carta’s tens of thousands of other customers in the process — Carta is exiting the business that landed it in trouble with the customer.
Carta co-founder and CEO Henry Ward posted on Medium tonight: “Because we have the data, if we are trading secondaries, people will always worry that we are using the data, even if we are not. So we have decided to prioritize trust, and exit the secondary trading business.”
It’s a dramatic turn of events for 14-year-old Carta, which originally focused on cap table management software but began over time to evolve into a “private stock market for companies” to take advantage of the network of companies and investors that already use its platform and into which it has insights. The big idea was to become the transfer agent, brokerage and clearinghouse for all private stock transactions in the world.
While the move made Carta more valuable in the eyes of its venture backers — a company has to scale, after all! — it put the company on dangerous footing after Linear’s Finnish CEO, Karri Saarinen, posted on LinkedIn on Friday that Carta was using information about his company’s investor base to try to sell its shares to outside buyers without the company’s knowledge or consent.
Wrote Saarinen, whose project management software company Linear is four years old and a Carta customer: “As a founder it feels kind [of] shitty that Carta, who I trust to manage our cap table, is now doing cold outreach to our angel investors about selling Linear shares to their non disclosed buyers.” Saarinen continued: “They never contacted us (their customer) about starting an order book for Linear shares. The investor they reached out to is a family member whose investment we never published anywhere. We and they never opted in to any kind of secondary sales. Yet Carta Liquidity found their email and knew that they owned Linear shares.”
While Ward apologized publicly to Saarinen, blaming a rogue employee who “violated our internal procedures and went out of bounds reaching out to customers they shouldn’t have,” Saarinen continued the discussion very publicly, saying he had identified numerous other founders whose investors had also been contacted by Carta representatives without their knowledge.
In his post tonight, Ward downplayed the impacts of ending secondary trading on Carta, saying the revenue derived from the practice is minuscule compared with Carta’s other business offerings. According to Ward, Carta’s cap table business “is about $250M/year, fund administration is about $100M, private equity is about $20M, and the secondary trading business is about $3M.” Carta, he added, has done a “decent job of building the cap table business, an ok job at fund admin (but feeling the growing pains), and an abysmal job at the secondary business.”
Further, he continued, having precious customer data that others do not isn’t the superpower that outsiders may think — certainly not if Carta is going to be a good actor in the private company ecosystem.
Ward, widely known to be brusque, strikes an uncharacteristically humble tone in the Medium post, writing, “ALL of my ideas around liquidity — auctions, investor matching, secondary trading, open tender offers, have not worked. I might not be the entrepreneur that can solve this problem.” Indeed, he continued, “Carta might not be the company that can solve this problem. Many people think we are best poised to solve liquidity because we have cap table data. But that same argument is used for data products. People say ‘You have all the data so you should put Pitchbook out of business!’ But it is precisely because we have the data, that we can’t use it. It is our customers’ data, not ours. That’s why in ten years, Carta has never released a data product. I use Pitchbook and TechCrunch when I research a company before I meet the CEO.”
“Having ground truth data is not an advantage if we can’t use it. And it is a disadvantage if people think we use it,” added Ward.
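Ward’s revenue breakdown makes the stakes of the exit concrete. A quick back-of-the-envelope check, using only the figures he cites above, shows how small the secondary business was relative to the whole:

```python
# Revenue by line of business, as stated by Ward ($M per year).
revenue = {
    "cap table": 250,
    "fund administration": 100,
    "private equity": 20,
    "secondary trading": 3,
}

total = sum(revenue.values())  # $373M/year across the four businesses
secondary_share = revenue["secondary trading"] / total

print(f"Total: ${total}M/year; secondary trading is {secondary_share:.1%} of revenue.")
# -> secondary trading is roughly 0.8% of Carta's stated annual revenue
```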
To Carta’s credit, the decision to back out of the secondary sales business came quickly; Carta also seemed to have little choice, with many founders threatening to move their startups’ business elsewhere after the events of this past weekend.
As founder Sim Desai of the financial services startup Hiive wrote on LinkedIn yesterday, “[A]side from [Carta’s] apparent breach of trust (possible to fix) and their lack of expertise (hard to fix), Carta faces another impossible conflict between these two business models. Even if they are not using their customers’ confidential information, it is the optics of a potential breach that will stand in the way.”
How the move impacts Carta’s own valuation remains to be seen, as does whether the company sticks to its guns once the startup market rebounds — along with demand for secondary shares.
In the meantime, if you missed the row with Linear that set tongues wagging over the weekend, you can read our earlier coverage here.
On Friday we had an internal policy violation that affected three companies. I’ve been in touch with the founders and I’m appalled we made that mistake; it should never have happened. It is unacceptable. We dealt with the violation on Saturday morning and are continuing the investigation to make sure it never happens again.
Let me share our framework on data privacy and access controls to hopefully address concerns from this weekend. For a deeper dive, I’ll group data privacy into four buckets, each with different rules, covered separately below.
1. Public Disclosures: We can only publish aggregate and anonymous data. So we can say things like there are 34K startups on Carta, or the average Series A startup has 25 employees, etc… However, we cannot say Acme Startup has 41 shareholders or that its price per share (PPS) is $13.24. You will see this type of aggregate anonymous information frequently in our data reports.
2. Internal Systems Disclosures: We can use cap table data for onboarding and internal systems development. So for example, we can load cap table data into dashboards for audit, we can write health checks to make sure cap table reports are correct, we can run machine learning algorithms to predict when you need a 409A, etc… We can use cap table data to help us improve the software or customer experience. This also includes things like when support teams access cap tables (through an approval and audit system) or when a customer needs help correcting or updating their cap table. All human access to cap tables is tracked and audited.
3. Sales & Marketing: We can also market to our customers and users. For example, we can offer new products to help companies with employee compensation, taxes, and expense reporting. Occasionally we have offered products directly to employee shareholders. For example, in the past we have offered stock-based loan products to employees of certain companies where employees can access loans to exercise their stock. But when we offer these products to employees we only do it in collaboration with the company. The company has to approve the program for their employees for us to offer it.
4. CartaX: CartaX is a separate product that operates as an opt-in marketplace where investors are invited to enter bids and asks on different companies. At any given time we have about one hundred companies that are in the marketplace. Where CartaX and the cap table business converge is if we match a trade in the marketplace, we go to the company and ask if they will allow it. If the company allows it, we use their cap table to execute the trade. If the company doesn’t allow it, we stop the trade. We do not and will never trade without company consent.
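Read as an access-control policy, the four buckets reduce to two consent gates: anonymization for anything public, and explicit company approval for anything that touches shareholders or trades. A minimal sketch of that logic (hypothetical code, not Carta's actual system):

```python
from enum import Enum, auto

class DataUse(Enum):
    PUBLIC_DISCLOSURE = auto()  # bucket 1: aggregate, anonymous stats only
    INTERNAL_SYSTEMS = auto()   # bucket 2: onboarding, audits, health checks (logged)
    SALES_MARKETING = auto()    # bucket 3: shareholder offers need company approval
    MARKETPLACE_TRADE = auto()  # bucket 4: CartaX trades need company approval

def is_permitted(use: DataUse, *, anonymized: bool = False,
                 company_approved: bool = False) -> bool:
    """Hypothetical consent gate modeling the four buckets described above."""
    if use is DataUse.PUBLIC_DISCLOSURE:
        return anonymized            # never company-identifiable data
    if use is DataUse.INTERNAL_SYSTEMS:
        return True                  # allowed, but access is tracked and audited
    # Both shareholder marketing and marketplace trades require the company's OK.
    return company_approved

# The Linear incident, in these terms: marketplace outreach without approval.
assert not is_permitted(DataUse.MARKETPLACE_TRADE, company_approved=False)
```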
In the case of Linear and two other companies, we had an internal breach of protocol and we contacted someone directly on the cap table. That never should have happened and is absolutely a breach of our privacy protocols. We addressed it over the weekend.
The second question is whether we are too close to the cap table business to be helping on liquidity. We started CartaX five years ago to help founders and companies with liquidity and it has mostly been a net positive for founders, employees, and shareholders. But even if we do everything perfectly and make zero mistakes, perhaps just the appearance of being in the liquidity business makes us seem compromised. Everything we do must be grounded in trust and if being in the liquidity business compromises that trust, perhaps we need to reevaluate that offering.
I will think about this and come back with more thoughts in the coming months. If you have a perspective on whether Carta should be helping companies with liquidity, please reach out to me; I’d love to hear it.
I’m sorry for scaring everybody about this. After ten years of managing cap tables across 40,000 startups, I promise we aren’t compromising anyone’s data. We won’t be here if you don’t trust us. Trust, transparency, and integrity are our most important currency. If you would like to chat with me more one-on-one, please email me at henry.ward@carta.com and we can set up a Zoom.
January 12, 2024
Ever since Garry Tan came on as Y Combinator CEO last year, there have been changes.
Last March, Tan cut YC’s late-stage investing and laid off 17 investors, and he shrank the size of YC’s batches. Less well known is that the startup accelerator also moved its headquarters. After spending 17 years operating out of Mountain View, Y Combinator moved its operations up north to the Dogpatch neighborhood in San Francisco and into Pier 70, according to city records reviewed by Fortune and confirmed by Y Combinator.
In an interview, Y Combinator CEO Garry Tan said it was important to him that YC be as close to the cutting edge of artificial intelligence innovation as possible (OpenAI, Anthropic, and “a lot of the top talent” in artificial intelligence are in San Francisco rather than Silicon Valley, he says). Not to mention, the majority of Y Combinator partners, including Tan himself, live in San Francisco, he adds.
These days, “you sort of have to be in San Francisco,” Tan says, noting the importance of the accidental run-ins that happen when people are out and about. “Chances are the people around you are thinking about and talking about technology, and especially A.I. That’s really special,” Tan says. He also pointed out that YC data shows that startups that were built in San Francisco were more likely to succeed than their peers.
The move from Mountain View, which happened in the spring, right before YC welcomed its summer batch of startups, is also part of Y Combinator’s broader effort to bring its program back to being fully in-person post-COVID. Y Combinator started doing so in Summer 2022, though its demo days have still been remote. (YC says its next Demo Day will be partially in-person at Pier 70, though presentations will still be online.)
Tan doesn’t just want founders back in the Bay Area: He wants them very close by. The Y Combinator CEO says the accelerator is highly encouraging founders to get places in the Dogpatch or Potrero Hill, or at least nearby. “I think it’s actually going to be really good that people know to be in walking distance of each other, and I think the connections between the founders are just going to be that much stronger,” he said, noting that he hopes Y Combinator…
…More
CASEY NEWTON, JAN 8, 2024

Substack is removing some publications that express support for Nazis, the company said today. The company said this did not represent a reversal of its previous stance, but rather the result of reconsidering how it interprets its existing policies.
As part of the move, the company is also terminating the accounts of several publications that endorse Nazi ideology and that Platformer flagged to the company for review last week.
The company will not change the text of its content policy, it says, and its new policy interpretation will not include proactively removing content related to neo-Nazis and far-right extremism. But Substack will continue to remove any material that includes “credible threats of physical harm,” it said.
In a statement, Substack’s co-founders told Platformer:
If and when we become aware of other content that violates our guidelines, we will take appropriate action.
Relatedly, we’ve heard your feedback about Substack’s content moderation approach, and we understand your concerns and those of some other writers on the platform. We sincerely regret how this controversy has affected writers on Substack.
We appreciate the input from everyone. Writers are the backbone of Substack and we take this feedback very seriously. We are actively working on more reporting tools that can be used to flag content that potentially violates our guidelines, and we will continue working on tools for user moderation so Substack users can set and refine the terms of their own experience on the platform.
Substack’s statement comes after weeks of controversy related to the company’s mostly laissez-faire approach to content moderation.
In November, Jonathan M. Katz published an article in The Atlantic titled “Substack Has a Nazi Problem.” In it, he reported that he had identified at least 16 newsletters that depicted overt Nazi symbols, and dozens more devoted to far-right extremism.
Last month, 247 Substack writers issued an open letter asking the company to clarify its policies. The company responded on December 21, when Substack co-founder Hamish McKenzie published a blog post arguing that “censorship” of Nazi publications would only make extremism worse.
McKenzie also wrote that “we don’t like Nazis either” and said Substack wished “no-one held those views.” But “we don't think that censorship (including through demonetizing publications) makes the problem go away,” he wrote. “In fact, it makes it worse. We believe that supporting individual rights and civil liberties while subjecting ideas to open discourse is the best way to strip bad ideas of their power.”
The statement seemed to be at odds with Substack’s published content guidelines, which state that “Substack cannot be used to publish content or fund initiatives that incite violence based on protected classes.”
In its aftermath, several publications left the platform. Others, including Platformer, said they would leave if the company did not remove pro-Nazi publications.
Meanwhile, more than 100 other Substack writers, including prominent names like Bari Weiss and Richard Dawkins, signed a post from writer Elle Griffin calling on Substack to continue with its mostly hands-off approach to platform-level moderation.
From its inception, McKenzie and Substack co-founder Chris Best have touted freedom of speech as one of Substack’s core virtues. As a result, the platform has been embraced by fringe thinkers, who have built large businesses while promoting anti-vaccine pseudo-science, Covid conspiracy theories and other material that is generally restricted on mainstream social networks.
Substack has defended its approach by arguing that it is built differently from social networks, which optimize for engagement rather than subscription revenue. The company says it employs a “decentralized” approach to moderation that allows individual readers to decide which writers they want to subscribe to, and lets writers determine which comments they will allow and which blogs they will recommend.
(Incidentally, this approach means that you can’t currently report comments directly to Substack: only writers receive your reports. Platformer has reviewed several cases of violent material and death threats in Substack comments.)
At the same time, over the past couple years Substack has come to more closely resemble the social networks it often criticizes. Each week, Substack sends users a personalized, algorithmically ranked digest of posts from writers they don’t yet follow — a feature that can help fringe publications build larger audiences and make more money than they would otherwise.
And last year Substack launched Notes, a text-based social feed similar to Twitter that also surfaces personalized content in a ranked feed. Notes can also give heightened visibility and free promotion to extremists.
The question now is whether taking action against some pro-Nazi accounts will shift the perception that Substack is a home for the most extreme ideologies, and prevent an exodus among writers who prefer more aggressive content moderation.
In recent weeks, Platformer has worked with other journalists and extremism researchers in an effort to understand the scope of far-right content on the platform. We’ve now reviewed dozens of active, monetized publications that advance violent ideologies, including anti-Semitism and the great replacement theory.
Substack has argued that extremist publications represent only a small fraction of newsletters on the platform, and as far as we can tell this is true. At the same time, the site’s recommendations and social networking infrastructure are designed to enable individual publications to grow quickly. And the company’s outspoken embrace of fringe viewpoints all but ensures that the number of extremist publications on the platform will grow.
The company is now in a difficult position. Because it has branded itself as a bastion of free speech, any change to its content policy risks driving away writers who chose the platform in part for its rejection of aggressive content moderation. At the same time, other publications — Platformer included — have lost scores of paying customers who do not want to contribute to a platform that they see as advancing the cause of extremism.
In coming days, explicitly Nazi publications on Substack are slated to disappear. But the greater divide within its user base over content moderation will remain. The next time the company has a content moderation controversy — and it will — expect these tensions to surface again.
Substack’s removal of Nazi publications resolves the primary concern we identified here last week. At the same time, as noted above, this issue has raised concerns that go beyond the small group of publications that violate the company’s existing policy guidelines.
JAN 12, 2024
At the end of November, an article by Jonathan Katz appeared at The Atlantic, with the foreboding title “Substack has a Nazi problem”. (It seems more portentous with the original Random Headline Capitals, but you’re at a British English publication now, so suck up the lowercase.) Katz began:
The newsletter-hosting site Substack advertises itself as the last, best hope for civility on the internet—and aspires to a bigger role in politics in 2024. But just beneath the surface, the platform has become a home and propagator of white supremacy and anti-Semitism. Substack has not only been hosting writers who post overtly Nazi rhetoric on the platform; it profits from many of them.
“Profits from many of them.” This is quite a big claim, and you need to read pretty closely to see whether Katz manages to stand it up.
An informal search of the Substack website and of extremist Telegram channels that circulate Substack posts turns up scores of white-supremacist, neo-Confederate, and explicitly Nazi newsletters on Substack—many of them apparently started in the past year. These are, to be sure, a tiny fraction of the newsletters on a site that had more than 17,000 paid writers as of March…
…More (Charles has asked me to cut early so that you click through for the rest).
CASEY NEWTON, JAN 11, 2024

After much consideration, we have decided to move Platformer off of Substack. Over the next few days, the publication will migrate to a new website powered by the nonprofit, open-source publishing platform Ghost. If you already subscribe to Platformer and wish to continue receiving it, you don’t need to do anything: your account will be ported over to the new platform.
If all goes well, following the Martin Luther King Jr. holiday on Monday, you’ll receive the Tuesday edition of Platformer as normal. If you have any issues with your subscription after that, please let us know.
Today let’s talk about how we came to this decision, the debate over how platforms should moderate content, and why we think we’re better off elsewhere.
I.
When I launched Platformer on Substack in 2020, it was not in the belief that we would be here forever. Tech platforms come and go; in the meantime, they can also change in ways that make staying there impossible for the creators that rely on them. For this reason, I almost launched Platformer on a custom-built stack of services centered on WordPress, the way my inspiration Ben Thompson had done for Stratechery.
But Substack had some compelling advantages of its own. It was impressively fast and easy to set up. It paid for the design of Platformer’s logo. It offered me a year of healthcare subsidies, and ongoing legal support.
I also felt a personal connection to Substack’s co-founders, who believed that Platformer would succeed even before it had a name. They convinced me that I could thrive on their platform, and offered me a welcome boost in confidence as I considered leaving the best job I ever had to strike out on my own.
In the three years since, Substack has been a mostly happy home. Platformer has grown tremendously over that time, from around 24,000 free subscribers to more than 170,000 today. Our paid subscribers have allowed me to create new jobs in journalism. I’m proud of the work we do here.
Over that same period, Substack has faced occasional controversies over its laissez-faire approach to content moderation. The platform hosts a wide range of material I find distasteful and offensive. But for a time, the distribution of that material was limited to those who had signed up to receive it. In that respect, I did not view the decision to host Platformer on Substack as being substantially different from hosting it on, for example, GoDaddy.
But as I wrote earlier this week, Substack’s aspirations now go far beyond web hosting. It touts the value of its network of publications as a primary reason to use its product, and has built several tools to promote that network. It encourages writers to recommend other Substack publications. It sends out a weekly digest of publications for readers to consider subscribing to. And last year it launched a Twitter-like social network called Notes that highlights posts from around the network, regardless of whether you follow those writers or not.
Not all of you use these features. Some of you might not have seen them. But I can speak to their effectiveness: In 2023, we added more than 70,000 free subscribers. While I would love to credit that growth exclusively to our journalism and analysis, I believe we have seen firsthand how quickly and aggressively tools like these can grow a publication.
And if Substack can grow a publication like ours that quickly, it can grow other kinds of publications, too.
II.
In November, when Jonathan M. Katz published his article in The Atlantic about Nazis using Substack, it did not strike me as cause to immediately leave Substack. All platforms host problematic and harmful material; I assumed Substack would remove praise for Nazis under its existing policy that “Substack cannot be used to publish content or fund initiatives that incite violence based on protected classes.”
And so, after reading the open letter from 247 writers on the platform calling for clarity on the issue, I waited for a response.
The response, from Substack co-founder Hamish McKenzie, arrived on December 21. It stated that Substack would remove accounts if they made credible threats of violence but otherwise would not intervene. “We don't think that censorship (including through demonetizing publications) makes the problem go away — in fact, it makes it worse,” he wrote. “We believe that supporting individual rights and civil liberties while subjecting ideas to open discourse is the best way to strip bad ideas of their power.”
This was the moment where I started to think Platformer would need to leave Substack. I’m not aware of any major US consumer internet platform that does not explicitly ban praise for Nazi hate speech, much less one that welcomes Nazis to set up shop and start selling subscriptions.
But suddenly, here we were.
I didn’t want to leave Substack without first getting my own sense of the problem. I reached out to journalists and experts in hate speech and asked them to share their own lists of Substack publications that, in their view, advanced extremist ideologies. With my colleagues Zoë Schiffer and Lindsey Choo, I reviewed them all and attempted to categorize them by size, ideology, and other characteristics.
In the end, we found seven that conveyed explicit support for 1930s German Nazis and called for violence against Jews, among other groups. Substack removed one before we sent it to them. The others we sent to the company in a spirit of inquiry: will you remove these clear-cut examples of pro-Nazi speech? The answer to that question was essential to helping us understand whether we could stay.
It was not, however, a comprehensive review of hate speech on the platform. And to my profound disappointment, before the company even acted on what we sent them, Substack shared the scope of our findings with another, friendlier publication on the platform, along with the information that these publications collectively had few subscribers and were not making money. (It later apologized to me for doing this.)
The point of this leak, I believe, was to make the entire discussion about Nazi hate speech on Substack appear laughably small: a mountain made out of a molehill by bedwetting liberals.
To us, the six publications we had submitted had only ever been a question: would Substack, in the most clear-cut of all speech cases, do the bare minimum?
In the end, it did, in five out of six cases. As all of this unfolded, I spoke twice with Substack’s co-founders. And while they asked that those conversations be off the record, my understanding from our conversations — based on material they had shared with me in writing — was that in the future they would regard explicitly Nazi and pro-Holocaust material to be a violation of their existing policies.
But on Tuesday, when I wrote my story about the company’s decision to remove five publications, that language was missing from their statement. Instead, the company framed the entire discussion as having been about the handful of publications I had sent them for review.
I attempted to write a straightforward news story about all this, and wound up infuriating many readers. On the right, I faced criticism for making a fuss out of Substack hosting a handful of small Nazi publications. On the left, I faced even louder criticism for (in their view) appearing to celebrate and validate Substack’s removal of those same publications. (I wrote Tuesday that “Substack’s removal of Nazi publications resolves the primary concern we identified here last week.” I regret using that language. What I should have said was “Substack did the basic thing we asked it to,” and then emphasized that it did not address our larger concerns. Which I did go on to say, though not with the force that in hindsight I wish I had.)
I’m happy to take my lumps here. I just want to say again that to me, this was never about the fate of a few publications: it was about whether Substack would publicly commit to proactively removing pro-Nazi material. Up to the moment I published on Tuesday, I believed that the company planned to do this. But I no longer do.
From there, our next move seemed clear. But first I wanted to consult our readers, whose advice and support I have been so lucky to rely on over these past few years. Asking readers for their thoughts proved to be surprisingly controversial, especially in the Sidechannel Discord, where some of you wondered whether I was seeking a fig leaf of approval that we could use to justify staying here. But Platformer has as its readers some of the world’s smartest minds in content moderation and trust and safety — I sincerely wanted to get your thoughts before making a final decision.
Over the next 48 hours, the Platformer community raised a variety of sensible objections to how Substack had handled this issue. You pointed out that Substack had not changed its policy; that it did not commit explicitly to removing pro-Nazi material; that it seemed to be asking its own publications to serve as permanent volunteer moderators; and that in the meantime all of the hate speech on the platform remains eligible for promotion in Notes, its weekly email digest, and other algorithmically ranked surfaces.
In emails, comments, Substack Notes and callouts on social media, you’ve made your view clear: Platformer should leave Substack. We waited a day to announce our move as we finalized plans with Ghost and began our migration. But today we can say clearly that we agree with you.
Substack’s tools are designed to help publications grow quickly and make lots of money — money that is shared with Substack. That design demands responsible thinking about who will be promoted, and how.
The company’s defense boils down to the fact that nothing that bad has happened yet. But we have seen this movie before, from Alex Jones to anti-vaxxers to QAnon, and will not remain to watch it play out again.
III. Frequently asked questions about Substack and free speech
We’re still only talking about six newsletters. Aren’t you overreacting?
To be clear, there are a lot more than six bad publications on Substack: our analysis found dozens of far-right publications advocating for the great replacement theory and other violent ideologies.
…More
JAN 12, 2024

A fairly contrived effort to endlessly link the word Substack to the word Nazi has had some moderate success, unfortunately. Or at least enough success to have sparked an open letter republished on many individual Substacks calling on Substack to get rid of Nazis, a counter–open letter calling on it to maintain its liberal content-moderation standards, a statement from Substack co-founders Hamish McKenzie, Chris Best, and Jairaj Sethi explicitly stating that they do not plan to ban Nazis from the platform, a bunch of Substackers responding by leaving or threatening to leave if Substack doesn’t moderate the content it hosts more aggressively, and a spate of news coverage of all of the above.
This is all pretty odd given that Substack’s content guidelines are conspicuously written to hew quite closely to the First Amendment on matters of alleged hate speech and have been in place for more than two years. Plus, the site’s founders have been very consistent about their lack of interest in adopting a more conservative approach to speech on Substack, even when sticking to their guns has led to bad PR. Since it’s clear that on Substack, almost anything goes that doesn’t involve a credible threat of violence, no one should be surprised that unsavory types can set up shop here.
Earlier this week I critiqued the reporting of Casey Newton, arguing that his work on the controversy for his publication Platformer was shoddy and misleading, and seemed designed to obscure key information from his readers. At the end of the day, after what Newton described as a rather comprehensive search for extremist content on Substack, he and his team sent the company a grand total of six publications they believed violated its standards, and Substack banished five of them while declining to actually change its written policies. The publications in question, Substack told Newton in a statement, had 100 active readers between them and none had paid subscriptions turned on. Newton quoted selectively from Substack’s response, in a manner that excluded the number of Substacks he had reported, their moribund nature, and their lack of paid readership. When I asked Newton why he had left out this information, his answer — that revealing how many Nazi publications his team reported to Substack would put him and his team at risk of harassment at the hands of the Nazi authors in question — didn’t really make sense, and he wouldn’t elaborate on it. (Newton has since announced Platformer is leaving Substack.)
In this post I’d like to focus mostly on the article that started this whole affair: Jonathan M. Katz’s late November piece in The Atlantic, “Substack Has a Nazi Problem.” It turns out Katz almost entirely fabricated what is perhaps his most damning anecdote about Substack’s approach to extremism. After I lay out, in detail, how he did this, I’ll explain how The Atlantic (and Katz) responded to my critique. Then I’ll close with a discussion of the difficulty of developing consistent content moderation guidelines, drawing on several Substack competitors’ deeply troubled attempts to do so.
ANALYSIS BY CHERYL KNIGHT
In an exclusive analysis by theCUBE Research, industry experts assess the breaking news that Hewlett Packard Enterprise Co. has confirmed its acquisition of Juniper Networks Inc. for approximately $14 billion.
The acquisition — HP’s largest since the 2011 purchase of Autonomy Ltd., made before the company split into HPE and HP Inc. — is poised to double the size of HPE’s networking business, making it a major contributor to annual operating income. The deal aligns with HPE’s ambitions in the networking sector: it leverages Juniper’s advances in artificial intelligence, particularly the Mist AI service that enhances wireless access and network security, and positions HPE favorably in the burgeoning AI and cloud-native markets.
“From HPE’s standpoint, the marriage of HPE and Juniper makes a lot of sense,” said Dave Vellante, theCUBE Research analyst. “HPE’s got silicon chops going back to pre-split. If you think about HPE’s as-a-service portfolio, they have compute down with their service business, they’ve got Aruba, and now they’re adding in Juniper. They’ve got storage.”
Vellante and industry analyst John Furrier spoke with Zeus Kerravala, founder and principal analyst at ZK Research; Jake Kaldenbaugh, managing partner at CloudStrategies; and Steve Mullaney, former chief executive officer of Aviatrix, about the benefits and potential drawbacks of this major acquisition.
Is the acquisition a pure consolidation play to lower costs, raise revenue and increase industry leverage or an attempt to combine portfolios for true innovation opportunities?
HPE’s acquisition of Juniper Networks is less about pioneering uncharted technological territories and more about strengthening its current market position, according to Kaldenbaugh. It’s a consolidation play, aiming to unify and leverage the strengths of both companies, he added.
“This deal feels much more like a Broadcom-VMware play than it does somebody who’s reaching for the future,” he said.
The panelists also agreed that the pending deal is seen as a strategic effort to enhance HPE’s edge-to-cloud capabilities and compete more effectively against industry giants such as Cisco Systems Inc.
“If you look at the market and you look at the technology synergies between HPE and Juniper in this deal, particularly how their combined portfolios can put innovation back at the center of HPE’s strategy — specifically in AI and cloud-native environments — this acquisition is expected to double HPE’s networking business, creating a formidable position to play against Cisco and others as well,” Furrier said.
While the panel highlighted the potential for innovation, especially in AI and security assets, there’s also an acknowledgment that the acquisition could be viewed as a consolidation play, given the current trends in the industry. While the deal strengthens HPE’s portfolio, it’s crucial for HPE to not just focus on physical networking hardware, but also pivot toward software and cloud-based solutions, aligning with the industry’s shift from traditional hardware-centric approaches, according to Mullaney.
“The problem with what they’re doing is that it’s still very much focused on the physical world of networking: boxes,” he said. “It shifted from boxes to software and cloud about five years ago. They won’t have the growth until they actually start going after where the growth is, which is in the cloud. The growth will come when they really, truly understand that this is a cloud-centric, cloud-first kind of world.”
Also important in assessing the acquisition is the critical aspect of scale in the networking industry, according to Kerravala. The merger could forge a larger entity, better positioned to compete with dominant players such as Cisco. This perspective considers the historical challenges both HPE and Juniper have faced in growing their market share.
With stocks of both companies showing lateral movement, it’s clear that competing in a market led by a heavyweight such as Cisco is no small feat. In networking, where size significantly impacts a company’s ability to serve large, global clients, this merger could be a strategic move to create a more formidable competitor.
“If you combine the two companies together, you get in theory a much bigger company that can compete with Cisco,” noted Kerravala, encapsulating the potential of the deal to transform the competitive dynamics in the networking sector. This analysis underscores the merger’s rationale as a bid not just for growth, but for relevance and competitive parity in a challenging industry.
January 4, 2024

A slower final quarter ended a lackluster year for global startup funding as venture capital investors continued to hold back in 2023, Crunchbase data shows.
In all, 2023 is on pace to be the lowest year for venture funding since 2018. Global startup investment in 2023 reached $285 billion — marking a 38% decline year over year, down from the $462 billion invested in 2022.
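The headline numbers are internally consistent; a one-line check using only the totals reported above:

```python
funding_2022 = 462  # global venture funding, $B (Crunchbase)
funding_2023 = 285

yoy_decline = 1 - funding_2023 / funding_2022
print(f"{yoy_decline:.0%}")  # ~38%, matching the reported year-over-year drop
```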

Cutbacks were deep across all funding stages globally. Early-stage funding in 2023 was down more than 40% year over year, late stage by 37%, and seed just over 30%.
It’s worth keeping some perspective, though: Overall funding in 2023 was down by less than 20% when compared to the pre-pandemic years of 2018 to 2020.
Two years into the slowdown, the venture markets are still reckoning with the funding boom of 2021. The fall in tech stocks and the slowdown of the IPO market since the beginning of 2022 have tempered the industry. Valuations set in 2021 did not hold up in 2023, as promising companies raised flat and down rounds.
Startups last year navigated a tough funding environment, tightened their belts and focused on unit economics. Layoffs across tech deepened in 2023.
Investors deployed capital more sparingly, with a higher bar at each stage.
“You can get higher ownership as a fund than you could in 2021,” said Michael Cardamone of New York-based seed investor Forum Ventures. The current funding environment favors funds and is more difficult for startup founders, he said.
The U.S. — the largest startup investment market with about half of all venture funding — mirrored global trends. Funding to U.S.-based startups in 2023 totaled $138 billion, down by 37% year over year.
While most industries were down year over year, AI was the largest sector to show an increase. Global funding to AI startups reached close to $50 billion last year, up 9% from the $45.8 billion invested in 2022. The largest fundings in 2023 went to foundation model companies OpenAI, Anthropic and Inflection AI, which collectively raised $18 billion in 2023.
Semiconductors and battery tech also saw increased investment in 2023.
Two other industries, manufacturing and cleantech, held up better than the broader market: startups in both sectors were down year over year in 2023, but by less than 20%.
Web3, which experienced a runup in 2021 and into 2022, fell 73% year over year in 2023, from $28 billion to $7.6 billion.
Other leading sectors that were down year over year include financial services (down over 50%), e-commerce and shopping (down 60%), and media and entertainment (down 64%).
Q4 marks the lowest quarter for global venture funding in 2023. Quarterly funding totaled $58 billion, down 24% quarter over quarter and 25% year over year.


Seed funding totaled $7 billion in Q4, down just over 20% year over year from $9 billion.
Despite the cutbacks, seed is seen as the most robust funding stage, as new companies continue to be funded. And as it became more challenging to raise a Series A round, companies were more likely to raise follow-on seed funding.

Early-stage funding declined the most in 2023 compared to other funding stages.
In the fourth quarter, early-stage funding totaled close to $23 billion, down a tad quarter over quarter, and down 32% year over year from $33 billion.

Late-stage funding in the fourth quarter was 25% of the volume of the peak in Q4 2021.
Fourth-quarter funding reached $28.6 billion, down close to 20% year over year.
Funding at this stage fluctuated throughout 2023 as large fundings went to AI, semiconductor, battery and clean energy companies.

With the increased number of companies funded in recent years, and the tightening funding markets, we expect the layoffs of 2023 will give way to more companies closing in 2024.
The venture markets got more disciplined in 2023. Without a bump in exits, 2024 will continue to be tough for founders in a funders’ market.