Generative AI tools are reshaping the information environment in ways most audiences never see. From the data that trains them to the labour that maintains them, their inner workings raise urgent questions for journalism and democratic accountability.
Our world is in the midst of a disruption triggered by the development of Artificial Intelligence (AI). Companies selling AI tools have become the most valuable corporations in modern times, with market valuations in the trillions of dollars, larger than the GDP of most countries. They are becoming a pervasive influence on social, commercial, and political life, and are reshaping entire industries.
The media industry is among those facing new kinds of challenges due to the rise of AI. The practice and delivery of journalism, a vital component of functioning, healthy democracies, is changing in ways that are not obvious to its consumers.
Understanding the impact of AI on our information environment, and its political consequences, requires a basic grasp of what Generative AI is and how it works. We need to “lift the bonnet” on what will increasingly power the information we receive and consume.
Data: The engine powering generative AI
The development of Generative AI begins with collecting vast amounts of data – including text, images, videos, and sounds – by crawling and scraping the internet. Everything from journalism and academic outputs to the public web and text chats is gathered as data. This is bolstered by compilations of literature accessed, not always legally, through commercial licensing arrangements with media repositories.

The legitimacy of these forms of data collection is still unclear and has led to high-profile copyright and privacy litigation around the world. It has also triggered policy and regulatory debates about the legal conditions for accessing data, and loud complaints from creatives whose labour has become the basis of the vast revenues of the new multinational AI tech firms.
For these AI technologies, access to data itself is not enough. The data has to be converted into training datasets, a conversion that involves a range of computational steps and human labour. To make data meaningful in AI training, data workers have to label, clean, tag, annotate and process images and text, creating semantic links that enable GenAI models to produce meaningful responses to user ‘prompts’. Much of this data work is outsourced to lower-cost countries such as Kenya, India and China, where workers are paid low wages and face poor labour standards. Those datasets are then used to train AI models through the process of machine learning.
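To make that human labour concrete, a single annotated training record might look something like the simplified sketch below. The field names and values are hypothetical rather than drawn from any real company’s pipeline, but they show how the labels and tags added by data workers become the structure a model is trained on.

```python
# A hypothetical annotated training record, of the kind produced by data workers.
# Field names and values are illustrative only; real pipelines differ by company.
annotated_record = {
    "source_url": "https://example.com/news/inflation-report",  # where the raw text was scraped from
    "raw_text": "Inflation rose to 4.1 per cent in the June quarter...",
    "cleaned_text": "inflation rose to 4.1 per cent in the june quarter",  # stripped of markup, normalised
    "labels": {
        "language": "en",
        "topic": "economics",           # human-assigned topic tag
        "content_type": "news_report",  # distinguishes journalism from chat logs, fiction, etc.
        "quality": "high",              # annotator judgement used to filter or weight the example
    },
}

# Millions of records like this are combined into the datasets used for model training.
print(annotated_record["labels"]["topic"])  # -> "economics"
```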
Lifting the veil on how generative AI works
Machines don’t learn like humans do. What we call ‘machine learning’ is essentially a process of statistical pattern recognition. While there are many differing approaches to model training, in most cases it involves successive adjustments to vast numbers of internal numerical values, often called parameters or weights. This process is iterative: training repeats until the model’s predictions are sufficiently close to the expected results.
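A toy example helps make “successive adjustments to internal values” concrete. The Python sketch below fits a single internal value to a handful of example data points by repeatedly nudging it to reduce prediction error. Real systems adjust billions of such values using far more sophisticated update rules, but the iterative, error-driven loop is the essential idea.

```python
# Minimal sketch of iterative training: adjust an internal value (a "weight")
# until predictions are close enough to the expected results.
# Real models adjust billions of such values; the loop structure is the point.

examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, expected output) pairs
weight = 0.0          # the model's single internal value, before training
learning_rate = 0.05

for step in range(1000):
    total_error = 0.0
    for x, expected in examples:
        prediction = weight * x
        error = prediction - expected
        weight -= learning_rate * error * x   # nudge the weight to shrink the error
        total_error += error ** 2
    if total_error < 1e-6:                    # "sufficiently close to the expected results"
        break

print(round(weight, 3))  # converges towards 2.0, the statistical pattern in the data
```

The model never “understands” that the outputs are double the inputs; it simply ends up with a number that reproduces the pattern in its training data.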
Once trained, models like those that power ChatGPT can, when prompted to, for example, “write a short news story on inflation figures”, generate a sequence of ‘tokens’ (word fragments) that statistically resembles similar stories seen during model training.
Critically, systems such as ChatGPT do not understand the world they depict or describe. They do not possess semantic knowledge, meaning they can’t understand facts or concepts such as what “inflation” means or what a “street protest” looks like. Instead, the machines are pattern-modelling engines that predict what content would most plausibly complete or correspond to a given prompt. In sum, the AI output is simply a function of scale and training data – not comprehension.
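A crude next-word predictor illustrates the point. The hypothetical sketch below counts which word follows which in a tiny ‘training corpus’, then completes a prompt by always choosing the statistically most likely next word. It has no notion of what inflation is; it simply reproduces patterns, which is the same basic mechanism, at vastly greater scale and sophistication, behind large language models.

```python
from collections import defaultdict, Counter

# A tiny "training corpus": the model only ever sees word sequences, never meanings.
corpus = (
    "inflation rose to record highs this quarter . "
    "inflation rose sharply as prices climbed . "
    "prices climbed across the economy this quarter ."
).split()

# Count which word follows which (a simple bigram model).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def complete(prompt_word, length=6):
    """Extend a prompt by repeatedly choosing the most likely next word."""
    words = [prompt_word]
    for _ in range(length):
        options = following.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])  # plausible, not verified
    return " ".join(words)

print(complete("inflation"))  # -> "inflation rose to record highs this quarter"
```

The output reads fluently because it mirrors the training text, not because anything has been checked against the world.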
What does generative AI mean for journalism?
The predictive capacity that makes generative AI powerful also makes it unreliable. Prediction is not verification. These systems fill gaps with what sounds or looks right, not necessarily with what is right.
A generative AI model can, in seconds, write fluently, summarise lengthy reports, or rephrase complex passages. It can produce images of events that appear photorealistic. But those outputs are the products of prediction, not verification. When AI is trained on biased or incomplete data, it is known to “hallucinate” content that looks and sounds right but is inaccurate or unreliable.
That distinction matters profoundly for journalism, which depends on truth and verification rather than plausibility.
For journalists and audiences alike, the risk lies in not being able to verify AI-generated content. As ever more AI-generated content is pushed into the information ecosystem without clear labelling or context, it contributes to a media environment where the difference between reporting and simulation, and between fact and fabrication, becomes increasingly difficult to discern.
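One practical response is to attach explicit provenance labels to machine-generated material before it enters a publishing workflow, so editors and readers can tell reporting from simulation. The sketch below is a hypothetical, heavily simplified illustration of that idea; real initiatives such as content-credential standards are far more elaborate, but the principle of carrying “who or what made this” alongside the content is the same.

```python
from datetime import datetime, timezone

def label_ai_output(content: str, model_name: str, prompt: str) -> dict:
    """Wrap machine-generated content in provenance metadata before it is shared.

    A hypothetical helper: the field names are illustrative, not an existing standard.
    """
    return {
        "content": content,
        "provenance": {
            "generated_by": model_name,   # which system produced the text
            "prompt": prompt,             # what it was asked to do
            "generated_at": datetime.now(timezone.utc).isoformat(),
            "human_verified": False,      # flipped only after editorial checks
        },
    }

draft = label_ai_output(
    content="Inflation rose to 4.1 per cent in the June quarter...",
    model_name="example-llm",
    prompt="write a short news story on inflation figures",
)
print(draft["provenance"]["human_verified"])  # -> False until a journalist checks it
```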
Journalism’s future will depend on whether institutions can adapt to, and meaningfully govern, the use of AI. That means not only developing new editorial standards and verification practices, but also putting much greater effort into ensuring that the data, labour, and energy sustaining these systems are made visible and accountable.
The question is not whether AI will reshape journalism. It already has. The question is whether it is possible for democratic societies to prevent AI from undermining trust in public institutions.
For those of us concerned about where our information and journalism come from (their provenance), human capacity to check and verify information cannot match the lightning speed at which chatbots spit out dodgy text, data and images. Unless we develop protocols and methods to regain control, oversight and checks before machine outputs are shared, we face the further erosion of the bedrock of society: the agreed facts that allow rational thinking and consequent behaviour.
Dr Jake Goldenfein is at Melbourne University’s Law School and a Chief Investigator in the ARC Centre of Excellence for Automated Decision-Making and Society. Dr Fan Yang is also at Melbourne University Law School and studies the effects of large-scale international digital technologies. Daniel Angus is Professor of Digital Communication and Director of QUT’s Digital Media Research Centre.