How natural language processing can transform publisher SEO

With a history stretching back four centuries, Cologne’s DuMont publishing group began publishing prayer books in 1620 and published its first newspaper in Latin. 

Owned by the aristocratic DuMont family since 1805, the publishing group has ten local newspapers in Germany, including the historic Kölner Stadt-Anzeiger and the Cologne-based tabloid EXPRESS.

The company has a long history of technological innovation, having adopted its first high-speed press in 1833.

The organisation adopted an ambitious AI-driven digitalisation strategy in 2021 across its newspaper sites, focusing on tagging and topic pages.

AI technology reads and categorises articles to boost SEO

The DuMont publishing group has interests in book publishing and marketing as well as newspaper publishing. It aims to integrate artificial intelligence into many of its operations, and a dedicated ‘AI Circle’ within the company focuses on specific AI projects.


With newspapers in Frankfurt, Berlin and Saxony, as well as a huge readership in its native Cologne and Bonn region, the company hoped to use AI to drive circulation and reach across websites including Express.de and Ksta.de.

The company turned to metadata and taxonomy experts iMatrics to boost reach and SEO across the sites, using software which automatically reads and categorises articles.

The problem with manual SEO

Previously, the organisation had relied on staff manually entering keywords – which led to problems, says Alina Gerber, data scientist at the Kölner Stadt-Anzeiger Medien.

Gerber says: “As a data scientist, I had a lot of questions about, ‘What kind of topics do the readers like to read? What performs well?’ But we only had the categories that you would see in the navigation, very broad things like, the sports section or regional politics.”

The other problem was that people don’t tend to enter the same keywords, even when writing on the same subjects, Gerber says.

“This was the major thing we were struggling with, because we had keywords, but they were manually entered – and people don’t just make mistakes, they enter the keywords differently every time. I learned that there were more than 100 ways to write FC Köln, which is our local football club.”

The answer lay in artificial intelligence, and in particular natural language processing (NLP), Gerber says.
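The inconsistency Gerber describes can be illustrated with a small sketch. The Python below is not DuMont's or iMatrics' system: it assumes a hand-written alias table (the hypothetical `ALIASES`) and only shows how differently typed keywords might be folded onto one canonical tag. A real NLP tagger would identify the entity from the article text itself rather than from a lookup.

```python
import unicodedata

# Hypothetical alias table: normalised spelling -> canonical tag.
# The entries are illustrative, not taken from any real keyword list.
ALIASES = {
    "fc koln": "FC Köln",
    "fc koeln": "FC Köln",
    "1 fc koln": "FC Köln",
    "1 fc koeln": "FC Köln",
}


def normalise(keyword: str) -> str:
    """Case-fold, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", keyword.casefold())
    # Drop combining marks, so "ö" becomes "o".
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace anything that is not a letter or digit with a space.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents)
    return " ".join(cleaned.split())


def canonical_tag(keyword: str) -> str:
    """Return the canonical tag for a keyword, or the trimmed keyword if unknown."""
    return ALIASES.get(normalise(keyword), keyword.strip())


if __name__ == "__main__":
    for typed in ["FC Köln", "fc-koeln", "1.FC Köln", "Boris Becker"]:
        print(repr(typed), "->", canonical_tag(typed))
```

Normalisation alone cannot map the "oe" spelling onto "ö", which is why the table lists both; with more than 100 variants in play, maintaining such a table by hand quickly becomes impractical.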

Before approaching iMatrics, the company was already experimenting with using AI in the newsroom in different ways. 

One such experiment was an AI-driven service that allows reporters to access data about readers, says Robert Zilz, head of data at the Kölner Stadt-Anzeiger Medien.

Zilz says: “We have done a lot of data analytics and generated a lot of insights: we have a lot of users visiting our websites and we have a very deep understanding of them thanks to backtracking and cloud-native custom tracking solutions.

“I hoped to enable people to work in a more data-driven way, not in terms of like reading dashboards or reading reports, or asking the data team, but more being more self-reliant.”

Automatic article tagging drives reach, advertising and personalisation 

The whole idea of using natural language processing to understand and tag articles is relatively new.

Zilz hopes that, by categorising articles using iMatrics’ technology, DuMont can drive reach, boost advertising and personalise the site for users.

“We had this vision that we would be able to use NLP to get a better understanding of our article information. 

“It was a game-changer for us to have a natural language processing service in place inside the content management system where the editorial team are actually doing their daily work, without interrupting them.”

iMatrics tags articles automatically, managing metadata to create topic pages. 
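As a rough illustration of how tags can feed topic pages, the Python sketch below groups articles by tag. It assumes each article is a simple record with a `url` and a list of `tags`; the function name, the record shape and the example URLs are all hypothetical and do not describe the iMatrics API or DuMont's content management system.

```python
from collections import defaultdict


def build_topic_pages(articles):
    """Group article URLs under each tag; one entry per topic page."""
    pages = defaultdict(list)
    for article in articles:
        for tag in article["tags"]:
            pages[tag].append(article["url"])
    return dict(pages)


if __name__ == "__main__":
    # Hypothetical sample records; the URLs are placeholders.
    sample = [
        {"url": "/sport/article-1", "tags": ["Boris Becker"]},
        {"url": "/koeln/article-2", "tags": ["Young Cologne"]},
        {"url": "/sport/article-3", "tags": ["Boris Becker", "Young Cologne"]},
    ]
    for topic, urls in build_topic_pages(sample).items():
        print(topic, urls)
```

Because every article carrying the same tag lands on the same page, consistent tagging is what makes each topic page a single, stable destination for search crawlers.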

Zilz says that doing this made the newspapers’ stories more visible to the ‘bots’ Google uses to create its search results. Google is one of the company’s major traffic sources, accounting for around 60% of site traffic.

“It was a big deal to look at our biggest traffic source, Google, and make sure we are doing our best, using the topic pages to gain trust and visibility. This is one of our core features.”

Zilz now hopes to build other applications using the tagged articles, particularly around the company’s live advertising service. 

With third-party cookies set to be switched off in Google’s Chrome browser next year, the ability to serve contextual advertisements based on website data will become more important.

At Express.de, the topic pages on subjects ranging from Young Cologne to Boris Becker now account for 10% of the site’s visibility on search.

Clicks on the topic pages rose 50% in one year, following the iMatrics integration, and drove more traffic across the whole site.

The company now hopes to work closely with iMatrics to build further applications based on article metadata.

Zilz says: “iMatrics are very much on the same level as us, and are super-stoked about stepping further into the world of AI.”
