Like any other modern industry, news publishing is gradually adopting automation powered by Artificial Intelligence (AI), in particular by its machine learning and natural language processing subfields. Struggling to monetize their content, publishers experiment with ad formats and diversify revenue streams by introducing paid subscriptions, while striving to reduce production costs at the same time. This is where AI can do the trick.
Much of today’s AI rests on machine learning, which allows computers to process vast amounts of data and to learn from it without being explicitly programmed for each task. First, machines have to rely on a set of rules in order to learn how a human would perform a particular task; after that, the algorithm can work on its own. Below are the most innovative ways of automating content production that are gaining momentum at the biggest news organizations right now:
- Automated reporting
- Reformatting of articles
- Text auto-tagging
- Content translation
- Content moderation
- Chat bots
- Content personalization
- Predictive analytics
- Image recognition and auto-tagging
If you’re a news publisher, there’s no need to assign journalists to tons of routine stories: an algorithm can cover them at a fraction of the cost, faster, and often with fewer errors. The only requirement is to ‘feed’ the robot clean, structured data that can be parsed into ‘variables’. Some bigger news agencies, such as the Associated Press (AP), stepped into generating automated content as early as 2014, when the news giant started producing automated stories on corporate financial results using the Wordsmith platform by Automated Insights. As Philana Patterson, Assistant Business Editor at AP, said at the time, automating quarterly financial reports freed up to 20% of editors’ time, so they could focus on other tasks.
By 2016, according to a report by the Tow Center for Digital Journalism, leading publishers such as Forbes, ProPublica, The New York Times and the Los Angeles Times had also started to use AI for content production. However, the technology is still emerging and is suited only to topics where the accuracy of data matters more than the quality of writing, e.g. financial reports or breaking news.
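To make the idea of structured data parsed into ‘variables’ more concrete, here is a minimal sketch of template-based story generation. It is not how Wordsmith works internally; the `EarningsReport` fields, the sentence template and the sample figures are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class EarningsReport:
    """Structured input: one row of (hypothetical) quarterly results."""
    company: str
    quarter: str
    revenue_musd: float        # revenue in millions of US dollars
    prior_revenue_musd: float  # same quarter, previous year
    eps: float                 # earnings per share, in dollars


def describe_change(current: float, previous: float) -> str:
    """Turn two numbers into a short phrase such as 'rose 10.0%'."""
    if previous == 0:
        return "changed from zero"
    pct = (current - previous) / previous * 100
    if pct > 0:
        return f"rose {pct:.1f}%"
    if pct < 0:
        return f"fell {abs(pct):.1f}%"
    return "was flat"


def write_story(r: EarningsReport) -> str:
    """Fill a fixed sentence template with the parsed variables."""
    change = describe_change(r.revenue_musd, r.prior_revenue_musd)
    return (
        f"{r.company} reported {r.quarter} revenue of "
        f"${r.revenue_musd:,.1f} million, which {change} from a year earlier. "
        f"Earnings per share came to ${r.eps:.2f}."
    )


# Invented sample data, not real financial results.
report = EarningsReport("Example Corp", "second-quarter", 1320.0, 1200.0, 0.85)
story = write_story(report)
print(story)
```

The same template can be run over thousands of rows, which is why this approach suits data-heavy stories better than ones that depend on the quality of the writing.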
AP seems to be among the pioneers of automated article reformatting as well. Their reporters used to rewrite a single article by hand to fit several different channels. That’s why, back in 2016, their internal team, in collaboration with the media startup accelerator Matter Ventures, started a new project: developing software that could automate the re-production of a story for all channels, whether print or broadcast. First, they built a template that transformed text written for print into several variations of copy for digital by cutting the word count, making sentences more concise and rounding numbers. After a while, a self-learning algorithm, guided by an editor, gained enough knowledge to produce multiple versions of the same text autonomously.
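The first, template-driven stage described above (shorter copy, rounded numbers) can be approximated with a few text rules. The sketch below is only an illustration under simple assumptions: the rounding thresholds, the function names and the sample copy are made up, and AP’s actual software is not public.

```python
import re


def round_numbers(text: str) -> str:
    """Replace exact figures such as 1,284,530 with 'about 1.3 million'."""
    def shorten(match: re.Match) -> str:
        value = int(match.group().replace(",", ""))
        if value >= 1_000_000:
            return f"about {value / 1_000_000:.1f} million"
        if value >= 10_000:
            return f"about {round(value, -3):,}"
        return match.group()

    # Match integers written with thousands separators, e.g. 12,345.
    return re.sub(r"\d{1,3}(?:,\d{3})+", shorten, text)


def shorten_copy(text: str, max_sentences: int) -> str:
    """Keep only the first few sentences (a crude stand-in for real editing)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


# Invented sample copy for a print channel.
print_copy = (
    "The city counted 1,284,530 residents in the latest census. "
    "Officials said 48,612 people moved in during the past year. "
    "The full report will be published next month."
)
digital_copy = round_numbers(shorten_copy(print_copy, max_sentences=2))
print(digital_copy)
```

A learned model would replace these hand-written rules, but the input and output stay the same: one source text in, several channel-specific versions out.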
When creating a digital article, journalists normally have to either rely on the pre-programmed automated tagging available in their CMS or add tags manually; the latter may end up as total clutter. However, there are smarter alternatives such as “Editor,” a self-learning text-editing interface implemented by The New York Times that automatically tags text and creates annotations based on information gathered through a set of neural networks.
Most international news outlets strive to win a broader audience across countries and languages, and this is where translating and adapting content becomes a challenge. Although automated translation software and services like Google Translate have been around for years, the style of their output and its poor localization rarely meet the high journalistic standards of the most respected news organizations.
EurActiv.com, a multilingual policy news website, has been experimenting with automated content translation since its inception, and last year it started using AI-powered technology by the Latvian company Tilde to streamline its processes. The system analyzes tens of thousands of uploaded stories and their human-made translations to learn the language the site uses and align its output with the official style guide.
Detecting spam and abusive or inappropriate content in comment sections has long been a task that bigger news outlets tackled mostly by hand. Before February 2017, NYT’s staff moderators had to examine around 11,000 comments a day, posted to the roughly 10% of articles that were open for comments. Since then, they have introduced a machine-learning tool called Perspective that helps weed out toxic comments on their website automatically. Developed by Jigsaw, a technology incubator owned by Google’s parent company Alphabet, the tool compares each new comment with thousands of comments that human moderators have reviewed and labeled as ‘toxic’, and scores it accordingly.
With automated AI-powered moderation, NYT is planning to open 80% of its online content to comments by the end of the year. Perspective’s API is also being used by a number of other high-profile news organizations, and access can be requested at https://www.perspectiveapi.com.
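For readers curious about what scoring a comment looks like in practice, here is a small sketch built around the request and response shape of Perspective’s `comments:analyze` endpoint. No network call is made: the response is a hand-written sample, and the 0.8 review threshold and the `route_comment` helper are assumptions for illustration, not part of the API or of NYT’s workflow.

```python
import json

# Endpoint of the Perspective (Comment Analyzer) API. A real call would POST
# the JSON body below to this URL with an API key passed as "?key=...".
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"
REVIEW_THRESHOLD = 0.8  # hypothetical cut-off; each newsroom would tune its own


def build_request(comment_text: str) -> str:
    """Build the JSON body asking for a TOXICITY score for one comment."""
    body = {
        "comment": {"text": comment_text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    return json.dumps(body)


def toxicity_score(response: dict) -> float:
    """Extract the summary TOXICITY score (0.0 to 1.0) from a response."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


def route_comment(response: dict, threshold: float = REVIEW_THRESHOLD) -> str:
    """Send high-scoring comments to a human moderator, publish the rest."""
    return "needs_review" if toxicity_score(response) >= threshold else "publish"


# Hand-written sample in the shape of an API response, not real output.
sample_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.92, "type": "PROBABILITY"}}
    }
}
print(build_request("You are all idiots."))
print(route_comment(sample_response))
```

Keeping a human in the loop for high-scoring comments, rather than deleting them outright, is a design choice that lets moderators spend their time on the borderline cases.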
Since early 2016, when Facebook introduced its platform for creating AI-based chatbots for Messenger, news outlets have had another solid channel for content distribution, if used wisely. Using a tool called Wit.ai, publishers can build intelligent bots to automate and personalize interaction with users. A bot learns to understand human language and responds to basic queries, for example by delivering the latest news on a specific topic. However, the adoption of Messenger bots by news outlets has not always been smooth. Some have overwhelmed users with too many options or with features that do not work correctly, while others dose information carefully so that the bot can replace channels such as newsletters or app alerts and bring in even more readers (as in the case of TechCrunch).
Even five years ago, the possibility of delivering selected content to a specific reader at exactly the right moment would not have sounded realistic. Now AI-backed personalization via email makes it possible. It works like this: while a user interacts with a website’s content, an intelligent algorithm learns the user’s behavior, identifies preferences and the pages and topics with the highest engagement, and compiles a list of the most relevant links to be sent out in a newsletter, exactly when the user is most likely to open it and click through to read the content on the website.
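The loop described above (observe behavior, infer preferences, pick links, choose a send time) can be illustrated with a deliberately simple counting approach. Commercial tools use far richer models; the click log, URLs and helper names below are all hypothetical.

```python
from collections import Counter

# Hypothetical click log: (topic, url, hour of day when the reader clicked).
events = [
    ("politics", "/politics/budget-vote", 8),
    ("sports", "/sports/cup-final", 20),
    ("politics", "/politics/election-poll", 8),
    ("tech", "/tech/new-phone", 13),
    ("politics", "/politics/budget-vote", 9),
    ("tech", "/tech/chip-shortage", 8),
]

# Hypothetical pool of fresh articles: (topic, url).
candidates = [
    ("sports", "/sports/transfer-news"),
    ("tech", "/tech/ai-in-newsrooms"),
    ("politics", "/politics/budget-vote"),
    ("politics", "/politics/new-minister"),
]


def top_topics(log, n=2):
    """Return the n topics the reader engaged with most often."""
    counts = Counter(topic for topic, _, _ in log)
    return [topic for topic, _ in counts.most_common(n)]


def best_send_hour(log):
    """Return the hour of day at which the reader was most active."""
    return Counter(hour for _, _, hour in log).most_common(1)[0][0]


def pick_links(pool, topics, seen, limit=3):
    """Pick unread articles from preferred topics, most preferred first."""
    ranked = [
        url
        for topic in topics
        for cand_topic, url in pool
        if cand_topic == topic and url not in seen
    ]
    return ranked[:limit]


already_read = {url for _, url, _ in events}
newsletter = pick_links(candidates, top_topics(events), already_read)
print(newsletter, "send at hour", best_send_hour(events))
```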
According to Boomtrain, a San Francisco-based startup specializing in AI-powered marketing personalization, the average open rate for static emails in the media and publishing industry is around 19.24%, while for personalized emails it rises to 63.22%; the click rates are 13.16% and 26.29% respectively. The data comes from “The State of AI-powered email marketing report 2017,” which is based on nearly 235 million emails sent out by 65 companies, mainly from the news and publishing industry.
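To put those percentages in perspective, the short calculation below turns them into relative lifts and percentage-point gains. It uses only the four figures quoted above; the variable names are ours.

```python
static_open, personalized_open = 19.24, 63.22     # open rates, %
static_click, personalized_click = 13.16, 26.29   # click rates, %

# Relative lift: how many times higher the personalized rate is.
open_lift = personalized_open / static_open
click_lift = personalized_click / static_click

# Absolute gain in percentage points.
open_gain = personalized_open - static_open
click_gain = personalized_click - static_click

print(f"Open rate:  {open_lift:.1f}x, +{open_gain:.2f} percentage points")
print(f"Click rate: {click_lift:.1f}x, +{click_gain:.2f} percentage points")
```

In other words, by these figures personalized emails are opened a little over three times as often and clicked roughly twice as often as static ones.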
Predictive analytics is a sub-field of AI that encompasses complex methods of analyzing current and historical data in order to predict readers’ future behavior. By automating data forecasts, publishers can shape their monetization strategy in the most effective way. For example, The New York Times has successfully employed data science and machine learning to increase subscription-based revenue.
According to their Chief Data Scientist Chris Wiggins, the internal team uses tools such as supervised, unsupervised and reinforcement learning to learn the ‘genome’ of loyal subscribers, better understand the subscription funnel, and reveal the risk of individual subscriptions being cancelled. Machine learning can also help publishers drive subscriptions by analyzing which content is the most engaging and providing predictive insights to marketers, who can then promote that content efficiently and with precise targeting. The APIs built by NYT’s data science team are open to the public and can be viewed here: developer.nytimes.com/docs.
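As a rough illustration of the supervised-learning part, the sketch below trains a tiny logistic regression to estimate cancellation risk from two engagement signals. It says nothing about NYT’s actual models: the features, the eight training rows and the hyperparameters are invented, and a production system would rely on an established library and far more data.

```python
import math

# Hypothetical training data:
# (visits per week, articles read per week) -> 1 if the subscriber cancelled.
subscribers = [
    ((1.0, 2.0), 1), ((0.5, 1.0), 1), ((2.0, 3.0), 1), ((1.5, 1.0), 1),
    ((6.0, 15.0), 0), ((5.0, 12.0), 0), ((7.0, 20.0), 0), ((4.0, 10.0), 0),
]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features) -> float:
    """Estimated probability of cancellation for one subscriber."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(data, epochs=1000, lr=0.05):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in data:
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


weights, bias = train(subscribers)
at_risk = predict(weights, bias, (1.0, 1.5))   # barely engaged reader
loyal = predict(weights, bias, (6.0, 14.0))    # heavily engaged reader
print(f"low engagement: {at_risk:.2f}, high engagement: {loyal:.2f}")
```

Scores like these could be used to decide which subscribers receive a retention offer before they cancel.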
Finding an image storage solution that allows faster and better file search is a challenge that any publisher or news organization inevitably faces. Another challenge is manual image tagging, which involves hours of routine work and invites human error. On the other hand, if images remain untagged, archives are hard to navigate, so photo editors may have to re-purchase assets they already own, wasting even more time and money. Furthermore, extensive image archives without metadata lose their monetary value over time, mainly because the publisher cannot find, and therefore cannot monetize, its original proprietary photo content.
Fortunately, this is where AI-powered digital asset management (DAM) systems can help. Elvis DAM by WoodWing integrates with the APIs of three of the biggest providers of image-recognition algorithms: Google Cloud Vision, Amazon Rekognition and Clarifai. Each of these is trained to recognize physical objects (including faces and landmarks), distinguish their characteristics and add them as tags to the file’s metadata, which saves time and effort on search and streamlines content creation considerably. The big Swedish magazine publisher Aller Media and their Portuguese colleagues at Porto Editora are currently testing the image-recognition functionality within Elvis DAM.
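When several recognition services label the same image, their results have to be reconciled before they land in the file’s metadata. The sketch below shows one plausible way to do that; it is not Elvis DAM’s implementation, and the provider names, labels, confidence values and the 0.7 cut-off are all hypothetical.

```python
def merge_tags(provider_labels, min_confidence=0.7):
    """Keep each label's best confidence across providers, drop weak ones."""
    best = {}
    for labels in provider_labels.values():
        for name, confidence in labels:
            key = name.strip().lower()  # treat "Bridge" and "bridge" as one tag
            best[key] = max(confidence, best.get(key, 0.0))
    return sorted(tag for tag, conf in best.items() if conf >= min_confidence)


# Hypothetical label sets, as (label, confidence) pairs from two services.
provider_labels = {
    "provider_a": [("Bridge", 0.95), ("River", 0.62)],
    "provider_b": [("bridge", 0.88), ("river", 0.81), ("Boat", 0.40)],
}

# Hypothetical metadata record for one archived photo.
metadata = {"file": "archive/0001.jpg", "tags": []}
metadata["tags"] = merge_tags(provider_labels)
print(metadata)
```

A confidence cut-off keeps doubtful labels out of the archive, which matters because wrong tags make search results worse, not better.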
To learn more about how AI-backed DAM can streamline your content creation processes, visit our Elvis DAM page or sign up for our 30-minute free webinar on Sept. 26th: “How Artificial Intelligence brings value to your existing archives”.