This article originally appeared in Japanese in Foresight.
Amid the deepening military confrontation between the United States and China, the “battle of data” is becoming the focus of the conflict. In the modern age of information and communications technology, and in future wars, the quality and quantity of data obtained to allow artificial intelligence (AI) to perform machine learning will be key, and the ability to process that large volume of data will determine who wins or loses.
Explosive Volume of Data in War
During the wars in Afghanistan and Iraq in the 2000s, the US military reportedly suffered from information overload. At the time, large amounts of data were collected at US military headquarters in Kuwait and Qatar from satellites, aircraft, various radars and sensors, and field units. However, the data collected at the command center was varied and complex, and it contained duplication, ambiguity, and inconsistencies. AI was not yet sufficiently developed and could not process these large amounts of data and convert them into useful operational information.
Since then, the volume of data in warfare has exploded. For example, the resolution of some imaging satellites has improved to the point where road signs and road conditions can be identified. In addition, some satellites are capable of detecting the height of terrain and structures. Constellations of many small satellites pass over the same location several times a day and can detect short-term changes that were previously difficult to detect.
Thus, the amount of data obtained from imaging satellites, for example, is expanding in all four dimensions, including change over time. To give a sense of scale, as of 2018, sensors deployed by US intelligence agencies in combat areas around the world were reportedly capturing, in a single day, a volume of imagery equivalent to more than three seasons of high-resolution footage of every National Football League game (272 games).
Military Use of Large-Language Models Attracting Attention
AI is essential for processing such vast amounts of data. At the meeting of the Association of the United States Army (AUSA) in Huntsville, Alabama, on March 28-30, US Army leaders were unanimous in stating that future warfare will be “Data Centric” and that the focus of the US-China confrontation would be the “battle of data.” They then went on to describe the importance of using AI as a technology for processing data.
AI can automatically process volumes of data that humans cannot. Furthermore, by applying deep learning to combinations of heterogeneous data sets, it can extract valuable information that humans would otherwise overlook. Deep learning excels at non-linear processes, and this approach of creating new value by combining data that would not have been connected in the past is known as “data-driven.” It is already used in business, and the US military is now focusing on its military applications.
The concept of Decision Centric Warfare (DCW), currently under development by the US Department of Defense, aims to achieve superior decision-making speed through the use of AI and unmanned weapons. Joint All-Domain Command and Control (JADC2), also being developed by the US Department of Defense, uses AI to process data collected by numerous sensors to support commanders' decision-making.
In recent months, there has been a rapid increase in discussion in the US of military applications of large-language models such as ChatGPT. Large-language models are adept at using vast data sets to synthesize information and answer questions. Their success has shown that AI can exceed human performance in areas that no one had previously imagined.
It has also shown the potential for AI to be used in a new and unprecedented area: the planning of military operations. In addition, while attention has so far focused on deepfakes that generate false images and videos, the success of large-language models points to the possibility of mass-producing fake news articles at low cost.
The Chinese People's Liberation Army is also promoting the military use of AI based on the concept of Intelligentized Warfare, which was raised in 2019. The details of the discussions in China are not clear, but it is said that AI is being considered for processing information gathered by networks of unmanned weapons and undersea sensors in the waters around China. The Chinese People's Liberation Army is also discussing the possibility of using large-language models for planning military operations.
The Battle of Data in the Russo-Ukrainian War
This use of AI and data warfare is already taking place in the Russo-Ukrainian War, which is said to be the first war in which both sides use AI, particularly machine learning and deep learning algorithms. One example is the creation of a deepfake video of President Volodymyr Zelensky calling on people to stop fighting and surrender, and its spread on social media. Ukraine is also using facial recognition technology to identify Russian agents and soldiers for operational purposes.
On October 12, 2022, a Russian soldier posted a selfie on social media. A Ukrainian military research company spent several hours analyzing the image and identifying the location where it was taken. Two days later, an unexplained “explosion” was confirmed at that location. There have also been reported cases of Ukrainian hackers setting up fake Facebook accounts of attractive women to convince Russian soldiers to send photos. The hackers then located the Russians based on the photos, and the Ukrainian military shelled the location.
In addition, the Ukrainian government has created a mechanism for Ukrainian citizens to provide it with information through an official government app. Citizens use the app to submit evidence of Russian military movements and illegal activities, and the Ukrainian military uses this information in its operations. This is a new approach to utilizing the vast amount of data voluntarily provided by individuals, similar to the currently popular ChatGPT developed by US start-up OpenAI.
Unified Data Standards, Data Collection, and Data Contamination
Three things are important in such data warfare: the unification of data standards, the collection of data on the adversary country, and data contamination.
In modern warfare, data collected by numerous radars, sensors, and other sources must be shared by a large number of weapon systems. To achieve this, the data standards of diverse weapon systems produced by different manufacturers must be unified. In addition, modern warfare requires integrated operations across the land, sea, air, space, cyber, and electromagnetic domains, as well as cooperation with allies. Unification of data standards among service branches and with allies is therefore also important. If it is inadequate, the data received will have to be deciphered by humans and manually re-entered to conform to the standards of other systems, which will create bottlenecks in the conduct of operations.
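The idea of unifying data standards can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the two message formats ("vendor_a" and "vendor_b"), their field names, and the common `Track` record are assumptions, not real military data standards. Each source gets a small adapter that converts its own format and units into the shared record, so that downstream systems never need a human to re-enter the data.

```python
from dataclasses import dataclass

KNOTS_TO_MPS = 0.514444  # one knot in metres per second


@dataclass(frozen=True)
class Track:
    """Hypothetical common record that every system agrees to consume."""
    source: str
    lat_deg: float
    lon_deg: float
    speed_mps: float


def from_vendor_a(msg: dict) -> Track:
    # Assumed format A: separate lat/lon fields in degrees, speed in knots.
    return Track("vendor_a", msg["lat"], msg["lon"],
                 msg["speed_kn"] * KNOTS_TO_MPS)


def from_vendor_b(msg: dict) -> Track:
    # Assumed format B: position as a "lat,lon" string, speed in km/h.
    lat, lon = (float(part) for part in msg["pos"].split(","))
    return Track("vendor_b", lat, lon, msg["kmh"] / 3.6)


# One adapter per source format; adding a new source means adding one entry.
ADAPTERS = {"vendor_a": from_vendor_a, "vendor_b": from_vendor_b}


def normalize(vendor: str, msg: dict) -> Track:
    """Convert a vendor-specific message into the common Track record."""
    return ADAPTERS[vendor](msg)


if __name__ == "__main__":
    print(normalize("vendor_a", {"lat": 35.0, "lon": 139.5, "speed_kn": 10.0}))
    print(normalize("vendor_b", {"pos": "35.0,139.5", "kmh": 36.0}))
```

Without a shared record of this kind, every pair of systems would need its own translation step, which is the bottleneck described above.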
Then there is the need to collect data on an adversary country. In previous wars, information such as the speed of tanks or the frequencies used by communications equipment was considered valuable, and intelligence agencies collected such information and used it for operational planning. However, such catalog information will not be enough in future data warfare. Rather, it will be necessary to routinely collect large amounts of raw data on the adversary country, such as overhead images of its tanks and the radio waves actually emitted by its communications equipment, and to have AI learn from this data. This will make it easier in wartime to identify the adversary's tanks in satellite images, for example.
Such data collection battles are already taking place in peacetime. Both the US and China are using large numbers of satellites to acquire data on each other's militaries. In February 2023, the US Air Force shot down a Chinese balloon that had violated US airspace. The balloon was equipped with sensors for collecting electronic signals at close range, which are difficult to collect with satellites.
Conversely, it is also important to prevent an adversary from collecting data on one's own forces. However, it is not easy to prevent data collection by satellites and other means. For this reason, there is a concept of data contamination, which turns the adversary's data collection against it. This involves allowing the adversary country to collect the wrong data, so that its AI learns incorrectly. For example, equipment that emits radio waves could use different characteristics and frequencies in routine training than in actual warfare. Weapons could also be constantly disguised during training and while stored at bases, so that the adversary country's satellites collect misleading image data.
Thus, the focus of future military confrontations between the US and China will be the battle of data. Access to large amounts of high-quality data, and the ability to process that data using AI, will determine who wins the war.