More data is recorded on digital media every day than all the information stored since the beginning of mankind through 1970. According to a Science research article, if all the bytes (a unit of digital information that consists of eight binary bits) were stored on compact discs and stacked up, the tower of discs would extend beyond the Moon (more than 238,900 miles or 384,400 kilometers). And this is only the data stored between 1986 and 2007. The Covid-19 pandemic increased the use of digital technology by 400%, and by the end of 2030, the stack of discs would extend all the way to Mars with the data generated in just one year: 10²⁴ bytes. This data explosion is not just a physical storage problem, but also a scientific one. In 1964, standards bodies began adding prefixes to units of measurement to accommodate these unimaginably huge and tiny numbers. In November 2022, four new prefixes were added to the International System of Units (SI): ronna (10²⁷, symbol R), ronto (10⁻²⁷, r), quetta (10³⁰, Q) and quecto (10⁻³⁰, q).
Every time boundaries of the microscopic, physical, biological or mathematical worlds are crossed, new numbers are needed. While the surge in the amount of digital data is the clearest example of the need for new magnitudes, the astronomical distances of space exploration and the infinitesimal mass of subatomic particles also demand new descriptors. Martin Hilbert, author of the Science article and a professor at the University of Southern California in the United States, explains the problem: “The DNA of a single human can contain about 300 times more information than is stored in all our existing technological devices.”
The rapid pace of new scientific discoveries and the speed at which existing limits are exceeded have led to the use of informal terms. The internet is littered with references to hellabytes and brontobytes (both 10²⁷ bytes), unofficial terms whose symbols (‘h’ and ‘b’, respectively) can add confusion to research studies that use ‘h’ for hecto (10²) and ‘H’ for henry (the unit of electrical inductance). Also, ‘b’ is used for the barn, a unit of area (10⁻²⁸ m²), and ‘B’ is used for the bel, a unit of sound intensity.
To rectify the situation, representatives from 100 countries met at the 27th meeting of the General Conference on Weights and Measures (CGPM, the acronym in French) in mid-November “to introduce four new prefixes to the International System of Units (SI) with immediate effect.” The new SI prefixes to be used for multiples and submultiples of units are ronna, quetta, ronto and quecto. Now we can say the Earth’s mass is approximately six ronnagrams (roughly 5.975 sextillion metric tons), and an electron’s mass is about one rontogram.
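To make the arithmetic behind these figures concrete, here is a minimal Python sketch that converts a value in base units into a prefixed unit. The table name `SI_PREFIXES` and the helper `in_prefixed_units` are illustrative names chosen for this example, not part of any standard library, and the Earth's mass is taken as roughly 5.975 × 10²⁷ grams.

```python
# Name -> (symbol, power of ten). The four prefixes adopted in 2022,
# plus yotta and yocto, which previously marked the ends of the range.
SI_PREFIXES = {
    "quetta": ("Q", 30),
    "ronna": ("R", 27),
    "yotta": ("Y", 24),
    "yocto": ("y", -24),
    "ronto": ("r", -27),
    "quecto": ("q", -30),
}


def in_prefixed_units(value, prefix):
    """Express a value given in base units as a multiple of the prefixed unit."""
    _symbol, exponent = SI_PREFIXES[prefix]
    return value / 10.0 ** exponent


# Assumed figure for this example: Earth's mass in grams.
earth_mass_grams = 5.975e27
print(f"{in_prefixed_units(earth_mass_grams, 'ronna'):.3f} ronnagrams")

# The annual data volume projected for 2030, in bytes.
print(f"{in_prefixed_units(1e24, 'yotta'):.1f} yottabytes")
```

Dividing by the prefix's power of ten is all a prefix does, so "six ronnagrams" is simply another way of writing about 6 × 10²⁷ grams.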
The CGPM also resolved to reinforce “the essential role of the International System of Units (SI) in providing confidence in the accuracy and global comparability of measurements needed for international trade, manufacturing, human health and safety, protection of the environment, global climate studies and scientific research.” Moreover, the CGPM acknowledged the “scientific communities that depend on measurements that are not covered by the current range [such as] the needs of data science in the near future to express quantities of digital information using orders of magnitude in excess of 10²⁴… as well as the importance of timely action to prevent unofficial prefix names being de facto adopted in other communities.”
The addition of measurement prefixes is fairly commonplace. The CGPM adopted peta and exa in 1975, and in 1991 added zetta (10²¹), zepto (10⁻²¹), yotta (10²⁴) and yocto (10⁻²⁴). The CGPM also acknowledged that the main trigger for the addition of new prefix names is the growing requirements of data science and digital storage, which already use prefixes at the top of the existing range to express large amounts of digital information.
Richard Brown is the chief metrologist at the UK’s National Physical Laboratory and the driving force behind the new prefix names. “The prefix system has expanded over the years in response to advances in science and technology that require a wider range of orders of magnitude related to measurement,” said Brown. He presented a proposal for new prefix names at the CGPM meeting on November 17 after observing the widespread use of unofficial terms and studying alternatives for five years.
In a recent interview with Nature, Brown said he looked for words that began with the only letters not already in use as symbols for units or prefixes. He also wanted to stick to precedents introduced for the most recently added prefixes. For example, prefixes for multiples, such as giga, end in ‘a’, whereas prefixes for the smaller end of the scale, such as micro or nano, end in ‘o’.
Brown and the CGPM both agree that adopting the new prefix names was essential due to the demands of data science, the steady growth in data volume accelerated by widespread digitization, and the advent of new technologies, such as quantum computing. “These new prefixes,” said Brown, “will enable clear and unambiguous communication of these measurements for many years to come.”
The next problem will be to find new prefixes and symbols for magnitudes higher and lower than the ones recently approved. An easy solution is to use a numerical expression with a positive or negative exponent. Another is to stack existing prefixes like ‘kilo’ onto the new ones to create compound words like kiloquetta and kiloronna, although the SI does not currently permit compound prefixes.
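The exponent approach is easy to sketch in code. The following Python example picks the largest SI prefix that fits a value and falls back to plain exponent notation once the value lies outside the quecto-to-quetta range. The function name `format_quantity` and the three-significant-figure formatting are assumptions made for illustration, not an established convention.

```python
# (power of ten, symbol) for the SI prefixes in steps of 1000, largest first.
PREFIXES = [
    (30, "Q"), (27, "R"), (24, "Y"), (21, "Z"), (18, "E"), (15, "P"),
    (12, "T"), (9, "G"), (6, "M"), (3, "k"), (0, ""),
    (-3, "m"), (-6, "µ"), (-9, "n"), (-12, "p"), (-15, "f"), (-18, "a"),
    (-21, "z"), (-24, "y"), (-27, "r"), (-30, "q"),
]


def format_quantity(value, unit):
    """Format value with an SI prefix, or in exponent notation if none fits."""
    magnitude = abs(value)
    if magnitude == 0:
        return f"0 {unit}"
    # Beyond 999 quetta or below 1 quecto there is no official prefix,
    # so fall back to a numerical expression with an exponent.
    if magnitude >= 1e33 or magnitude < 1e-30:
        return f"{value:.3e} {unit}"
    for power, symbol in PREFIXES:
        if magnitude >= 10.0 ** power:
            return f"{value / 10.0 ** power:.3g} {symbol}{unit}"


print(format_quantity(2.5e31, "B"))   # within range: uses the quetta prefix
print(format_quantity(4.2e35, "B"))   # out of range: exponent notation
print(format_quantity(3e-29, "g"))    # within range: uses the quecto prefix
```

The fallback keeps every value expressible without inventing unofficial names, which is the confusion the new prefixes were meant to prevent.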
Brown thinks the adoption of new prefixes will take time, but rapid advances in information technology could shorten the time frame. According to Hilbert, “the fastest growing area of information processing is computing – its capacity has increased by 58% in just two decades.” An unreviewed paper by researchers from Epoch, an artificial intelligence (AI) research and forecasting organization, notes that as researchers build more powerful models with greater capabilities, they have to find ever more texts to train them on. “Large language model researchers are increasingly concerned that they are going to run out of this sort of data,” said Teven Le Scao, a researcher at AI company Hugging Face, in an MIT Technology Review article. In other words, AI systems will need more and more information to be able to determine what is relevant and suitable for machine learning.