Is Google Fonts tracking users? I researched several web font services and their privacy policies

It's been many years since I last updated my notebook of Chinese web font problems, but unexpectedly, I ran into a new topic today.
While eating and scrolling on my phone, I saw a report headlined "Bypassing cookie restrictions! Google accused of using free fonts to track users." Apart from INSIDE (a Taiwanese digital and tech media outlet), several domestic technology media such as Cool3C have also reported on it. The content roughly quotes foreign media reports, claiming that Google tracks users by providing specific free fonts for users to download and install, and then tracks their browsing behavior when they access the Internet through their devices. According to the reports, users can protect themselves by using the browser's built-in features or installing extensions to block the download and display of web fonts.

After reading this news, I am speechless. Blocking web fonts can cause the web page to run incorrectly? Does Google Fonts track users?
It is so absurd that even AI would not dare to write it. Let me take some time to analyze the content of this news.

The Daily Mail is a low-quality media banned by Wikipedia

The article is not written by INSIDE itself, but rather states that 'the original article was published in the collaborative media mashdigi and has been authorized for reprint by INSIDE.' Continuing to trace the source, it is stated that 'according to a report by the Daily Mail.'

The original English article has a very sensational headline.

EXCLUSIVE: How Google is using FONTS to track what you do online and sell data to advertisers – and what YOU can do about it

Getting excited just because it's a foreign media outlet? Are all statements made by white people always right?

As soon as I saw the name Daily Mail, a notorious media outlet, I recalled a 2017 BBC report: "Wikipedia: Daily Mail is 'not a reliable news source'" (BBC News Chinese). It is well known that Wikipedia is a world-class online encyclopedia that theoretically anyone can edit, but it has strict review and usage rules, and all content must be accompanied by sources. Wikipedia's editors concluded that the Daily Mail had a reputation for poor fact-checking, sensationalism, and outright fabrication, and stopped citing it as a source.

As of today, the name of the Daily Mail can still be found on Wikipedia's list of potentially unreliable sources, which can be accessed at Wikipedia:Potentially unreliable sources. Wikipedia has many quality control lists like this, including a list of fake news websites at Wikipedia:List of fake news websites.

When we see news from these media outlets, we need to think twice.

Can Google Fonts be used for advertising?

The original article highlights the information in the request headers (as summarized below) and claims that Google combines this information with personal data to gain a better understanding of users. Although the title mentions 'selling data to advertisers', the article does not provide any direct evidence that data from Google Fonts requests is used for ad delivery.

Your IP address is your unique online identifier tied to your devices, every webpage you visited, how long you spent there, and the links you click on that page.

This gets lumped in with all the other data Google collects on you everywhere else. If you want to be shocked, these three creepy lists show everything the tech giant knows about you.

As a multinational corporation, Google has its own privacy policies, and the privacy and data collection policy for Google Fonts is explicitly stated:

For clarity, Google does not use any information collected by Google Fonts to create profiles of end users or for targeted advertising.

Technically, it's just the header data carried with the HTTP request

The original article exaggerates the situation. Whenever a user's web browser requests data from a remote server, the request headers contain information that can be used to identify the user's device and browser. This applies not only to Google Fonts but to any website or server, including e-commerce sites, government websites, and sweepstakes sites. The principle is the same. When users browse the Daily Mail website, the same information is recorded too. A previous post on this blog discussed a related case: web traffic counter providers build their search engine market share charts from data collected through the same technical principle.

Here is a simple example:
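As a rough sketch of the point above, the following Python snippet lists the fields that any web server can read from an ordinary HTTP request. The function name `loggable_fields`, the IP address, the path, and the header values are all made up for illustration; they are not taken from a real Google Fonts request.

```python
from typing import Mapping


def loggable_fields(client_ip: str, request_path: str,
                    headers: Mapping[str, str]) -> dict:
    """Collect the fields any web server can see in a plain HTTP request."""
    # Header names are case-insensitive, so normalize them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "ip": client_ip,                          # needed to send the response back
        "url": request_path,                      # what was requested
        "user_agent": lowered.get("user-agent", ""),
        "referer": lowered.get("referer", ""),
    }


# Hypothetical request, shaped like what a browser sends for a font stylesheet.
example = loggable_fields(
    "203.0.113.7",  # address from a documentation-only range
    "/css2?family=Noto+Sans+TC",
    {
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
        "Referer": "https://blog.example/post",
    },
)
print(example)
```

Nothing here is specific to fonts: the same four fields are available to an image host, a news site, or any other server the browser contacts.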

The Google Fonts privacy and data collection policy page also states this clearly.

When I embed Google Fonts in my website via the Google Fonts Web API, what data does Google receive from my website visitors?
When end users visit a website that embeds Google Fonts, their browsers send HTTP requests to the Google Fonts Web API. The Google Fonts Web API serves the Google Fonts Cascading Style Sheets (CSS) and subsequently the font files specified in the CSS to the users. Such HTTP requests include (1) the IP address used by the respective user to access the Internet, (2) the requested URL on the Google server, and (3) HTTP headers including the user agent describing the website visitors’ Internet browser and operating system versions as well as the referer (i.e. the webpage on which the Google font is to be displayed).

When I embed Google Fonts in my website via the Google Fonts Web API, why does Google receive the IP address of my website visitors?
The Internet Protocol requires IP addresses to transfer data via the Internet between a given client (i.e. browser) and a given server. This is why every client request to any server contains the client’s IP address so that the server can respond to that IP address. Accordingly, the fact that Google’s servers necessarily receive IP addresses to transmit fonts is not unique to Google and is consistent with how the Internet works.

Who would set their server to not keep connection logs?
The question is how long the logs will be kept, and whether anyone can access them at will.
And whether the data is really used for special purposes other than operational analysis?

Google has created a page on data collection policies for Google Fonts, which tells us that it's not as scary as we think, and also provides a glossary of terms. But people still prefer to read sensational technology media and criticize tech giants like 'Be Evil'.

Going back to the original article, it just feels like a lot of hype:

Private or incognito browsing won’t protect you from this tracking.

Uh-huh... Incognito mode handles other functions and doesn't directly change IP, UA, or other information. If you want to change those, you'll need to use other methods.

Google also gathers information like the user’s IP address and the website visited…
When you visit a site that uses Google Fonts, you automatically hand it over to Google…

Well... if you unplug the network cable and don't go online, none of this information will be recorded by any evil server, and this is the perfect solution.

The original text only contains this sentence fragment:

When the font file is downloaded from Google, more than just the font loads.
Google also gathers information like the user’s IP address and the website visited, which is, in turn, cross-referenced with other data the tech giant has about the user.

But in Chinese reports, it becomes 'Google provides specific free fonts for users to download and install'.

Downloading and 'installing' font files and loading online fonts (which is what the original article discusses) are two completely different things, although both are free services provided by Google Fonts. If a downloaded and 'installed' font could record your IP and browsing history, that would indicate the file contains malware or spyware and came from some strange source, such as a collection of 100 free fonts on a shady website.

When using Google Fonts, does all text on the current page get sent back to the server?

It's common sense in computer science that the browser and web page program have to send all the text of the current web page to Google before they can get the font file to display beautiful fonts on the web browser, right?
In theory, this is possible, but rather than calling it a front-end technology issue, it should be seen as a problem of how each web font service implements it. Web technology professionals who often deal with web fonts and visual-related web technologies should be familiar with the technicalities of each web font service.

For example, Google Fonts mostly uses unicode-range to separate font subsets. The matching is handled entirely by the browser on the front end, so the backend server does not need to know which characters are on the webpage. Moreover, Google Fonts is publicly and freely available without registration, so Google's font server does not need as many authentication mechanisms as other font providers.

In simple terms, each character has a corresponding Unicode number, and Google Fonts cuts the font files into many small files. When the browser encounters text in a certain range, it loads the corresponding small file for that range. For example, the first small file of Noto Sans Traditional Chinese covers U+1f921-1f930, U+1f932-1f935, U+1f937-1f939, U+1f940-1f944, U+1f947-1f94a, U+1f950-1f95f, U+1f962-1f967, U+1f969-1f96a, U+1f980-1f981, U+1f984-1f98d, U+1f990-1f992, U+1f994-1f996, U+1f9c0, U+1f9d0, U+1f9d2, U+1f9d4, U+1f9d6, U+1f9d8, U+1f9da, U+1f9dc-1f9dd, U+1f9df-1f9e2, U+1f9e5-1f9e6, U+20024, U+20487, U+20779, U+20c41, U+20c78, U+20d71, U+20e98, U+20ef9, U+2107b, U+210c1, U+22c51, U+233b4, U+24a12, U+2512b, U+2546e, U+25683, U+267cc, U+269f2, U+27657, U+282e2, U+2898d, U+29d5a, U+f0001-f0005, U+f0019, U+f009b, U+f0101-f0104, U+f012b, U+f01ba, U+f01d6, U+f0209, U+f0217, U+f0223-f0224, U+fc355, U+fe327, U+fe517, U+feb97, U+fffb4. It contains these glyphs:

As for how to split the unicode-range values so that a page downloads only the small files it actually needs, this is where Google's expertise lies.
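The matching described above can be sketched in Python. This is only an illustration of the idea, not how any browser is actually implemented: the file names `latin.woff2` and `slice-0.woff2` are hypothetical, the second range is a short excerpt of the list quoted earlier, and wildcard ranges such as `U+4??` are not handled.

```python
def parse_unicode_range(spec: str) -> list[tuple[int, int]]:
    """Parse a CSS unicode-range value such as 'U+1f921-1f930, U+20024'."""
    ranges = []
    for part in spec.split(","):
        part = part.strip().rstrip(";")
        if not part.upper().startswith("U+"):
            continue  # skip empty or unrecognized pieces
        start, _, end = part[2:].partition("-")
        low = int(start, 16)
        high = int(end, 16) if end else low  # a single code point is a 1-wide range
        ranges.append((low, high))
    return ranges


def subsets_needed(text: str, subsets: dict[str, str]) -> set[str]:
    """Return the subset files whose unicode-range covers a character in text."""
    parsed = {name: parse_unicode_range(spec) for name, spec in subsets.items()}
    needed = set()
    for ch in text:
        code_point = ord(ch)
        for name, ranges in parsed.items():
            if any(low <= code_point <= high for low, high in ranges):
                needed.add(name)
    return needed


# Hypothetical subset table: file name -> unicode-range declared in the CSS.
SUBSETS = {
    "latin.woff2": "U+0000-00FF",
    "slice-0.woff2": "U+1f921-1f930, U+20024",
}

print(subsets_needed("Hi", SUBSETS))
print(subsets_needed("Hi \U0001F921", SUBSETS))
```

The decision about which files to fetch is made locally from the page text. The font server only sees which subset files were requested, and since each file covers many characters, that reveals little about the actual text.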

Unicode-range is not an obscure technology. If one's impression of web design is just dragging and dropping with templates and using AI-generated content, then it is true that they may never need to use it. One of its uses is when a webpage uses several fonts, each with Chinese and English characters, and the designer wants all Chinese characters to use one font, all English characters to use another font, and all numbers to use a third font for visual consistency. Instead of selecting each character with a mouse, designers can use CSS to set the unicode-range directly.

If you really don't like Google Fonts' web font hosting service and are concerned about leaving connection records on Google's servers, Google also provides a Self-hosted guide for website administrators to refer to. However, the guide does not include the techniques for slicing files into small pieces and splitting unicode-ranges. Without these techniques, users would have to download several tens of MB of Chinese font files when visiting a website for the first time. This would cause website traffic and page load times to explode.

Do other web font services send all text on the current page back to the server?

There are many providers of Web Fonts besides Google Fonts. Let's take a look at some of the more common Web Font services in Taiwan and assess them based on two criteria:
1. Whether they require sending all the text on the current webpage to their server before providing the font files.
2. Whether their servers collect and utilize user information for users browsing websites that use their web font services, and whether their attitude towards privacy policies is as transparent as Google's.

Adobe Fonts (Typekit)

First, let's take a look at the web font service of the notorious commercial giant, Adobe. Adobe previously introduced dynamic subsetting on its service.

The value of the chunks parameter obtained through testing corresponds to a specific subset of characters. If the webpage text consists only of numbers or letters, the subset will be numbered 511. For other Chinese characters, it will correspond to a different subset. Because a subset may contain many characters, it may not be easy to deduce from the subset number whether a user is reading content such as erotic novels or not.

Adobe Fonts has its own privacy policy page Adobe policies - Adobe Fonts, which contains a standard statement on what data is collected and how it is used:

Adobe uses the information collected from websites using Adobe Fonts to provision the Adobe Fonts service and diagnose delivery or download problems. This information is also used to pay and fulfill our contracts with the font foundries whose fonts are used. We share aggregate reporting with font foundries and we may confirm to a font foundry that you have a valid license from Adobe, but we do not otherwise share your personal information with font foundries.

It seems that the Adobe Fonts privacy policy page (Adobe policies - Adobe Fonts) makes no direct mention of using the data for Adobe's advertising services. Compared to other providers that we will examine later, Adobe Fonts' privacy level is slightly better.

justfont

This is a Taiwanese font design company that offers high-quality Chinese fonts. Their webfont service charges based on pageviews. Upon testing, it was found that they directly place all webpage text in Form Data and transmit all the text in the webpage to the server in plaintext, in its original order, without any omissions, in order to receive the font file for display on the browser.


(Testing was done using a lorem ipsum generator, some repeated text was not removed, and the text order is the same as the original paragraph.)

Regarding data collection and privacy, we could not find any information in the justfont member duty page. Most of the information on that page pertains to member functions (website administrators must register before embedding web fonts) and prohibitions against using pirated fonts. Unlike Google's explicit statement, there is no clear explanation of whether the font server collects user information.

TypeSquare

This webfont service also sends the webpage text to the font server: the text is appended to https://wf.typesquare.com/3/tsst/dist/zh_tw/ts and sent along with the request.
Although the requested data seems to be encoded and looks like gibberish, it is actually just base64 encoding. Starting with a simple test of one or two characters, it can easily be decoded back to the original text.
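To illustrate why base64 offers no privacy, here is a minimal Python round trip. The helper names and the sample string are made up for this sketch, and the exact request format TypeSquare uses is not reproduced here; the only point is that base64 is a reversible encoding with no key.

```python
import base64


def encode_for_request(text: str) -> str:
    """Encode text as base64, the way an opaque-looking parameter might be built."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_from_request(token: str) -> str:
    """Reverse the encoding; no secret or key is required."""
    return base64.b64decode(token).decode("utf-8")


sample = "字型"  # hypothetical page text
token = encode_for_request(sample)
print(token)                       # looks like gibberish
print(decode_from_request(token))  # but decodes straight back to the text
```

Anyone who can read the request, including the server operator, can run the decode step.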

The MORISAWA user policy mainly contains content related to membership. Regarding IP addresses and the information mentioned above, it states that 'the Company uses the IP address for the purpose of calculating the usage of this service, assisting in diagnosing server issues, and managing this service' and 'the Company uses this data to ensure the proper operation of this service.'

DFO Web Font

DFO also provides different implementation methods for various needs, such as when the page text rarely changes or when the page text is dynamically generated. The online demo appears to call the API at https://dfo.dynacw.com.tw/dfotw_ws/dfotw_ws.asmx/DFO_GetCSS_StyleTXT_Fixed, and the displayed text is transmitted as plaintext in its original order in the Form Data, similar to justfont.

The DFO privacy policy is relatively simple, stating that Dynacw will not sell or rent any of your personal information to third parties. It is aimed mostly at registered members, and says little about whether Dynacw collects user information from visitors browsing external websites that use DFO Web Font.

iFontCloud

After pasting the JS file path for referencing the web font, I began testing. Similarly, the text was placed in the Form Data and sent to the server, with duplicates first consolidated into one.

From the returned header, we can also see the server information of .NET Framework 2.0 and IIS 10.

iFontCloud privacy policy states that:

ARPHIC will not use this server to record and analyze 'individual' visitors.

Other web font services in China

I randomly checked a few websites and found that they usually transmit the text directly to the font service server when the webpage is loaded. Their privacy policies are generally standard, as follows:

2.2 During the use of our service, we will collect the following information:
2.2.1 Log Information: When you use our website's services, we will automatically collect your detailed usage situation as network logs, such as access date and time, IP address, terminal information, browser software information, etc.
2.2.2 We will collect, use, store, process and provide your user information in accordance with this agreement and the relevant product service agreement when providing you with business functions or specific services.

2.3.2 We may use user information statistics as the basis to irreversibly anonymize user information for designing, developing, and promoting brand new products and services. We will also analyze our service usage and may share this anonymous statistical information with the public or third parties;

Source: Han-Yi

Conclusion

It seems that the reports of Google Fonts using free fonts and web font services to track users are unfounded and are just low-quality content farming by some foreign media.

However, the operating mechanisms of some other web font services are more alarming, as they are not designed as discreetly as one might imagine. Someone with ill intent who gained access to such a font server could work out which IP address, at what time and on what device, browsed which webpage and what text that webpage contained, possibly including names, phone numbers, addresses, and emails. Knowing which webpage (URL or hostname) a visitor is viewing is one thing; knowing what text is on the webpage is another. Recovering the text from the URL alone requires writing a web crawler and may involve paying to register for access to certain pages, while reading text that was sent directly to the font server is far easier.

(Reading jokes on a webpage with beautiful webfonts.)

UPDATE: Google Fonts & GDPR

Later, I saw a 2022 GDPR case reported in an external article, along with a Chinese report, German Company Fined for Violating GDPR by Using Google Fonts. The gist is that a website in an EU member state used Google Fonts and, as described in this article and in Google Fonts' privacy and data collection policy, Google Fonts receives visitors' IP addresses. Since the website did not notify visitors, it was sued for violating GDPR, and the plaintiff won.

Collecting personal data requires various notifications, as regulated in GDPR Articles 13-14, so collecting without notifying does violate the regulation. However, some foreign web design and web font management service providers use this case to make a big deal out of Google Fonts itself being non-compliant with GDPR, which seems like a logical leap.

The judgment summary page contains this sentence:

Der Einsatz von externen Schriftartendiensten kann nicht auf Art. 6 Abs. 1 S.1 lit. f DSGVO gestützt werden, da der Einsatz der Schriftarten auch möglich ist, ohne dass eine Verbindung von Besuchern zu externen Servern hergestellt werden muss. (The use of external font services cannot be justified under Art. 6 para. 1 lit. f GDPR, as the use of fonts is also possible without establishing a connection between visitors and external servers.)

If a website takes its EU users seriously and does not want to be sued over its use of Google Fonts,
it should clearly inform visitors that their IP data will be collected when they access the website.
When users click "disagree", WebFontLoader JS or the CSS LINK should not load any Google Fonts-related files,
so that every area of the webpage that previously used Google Fonts falls back to the system default font or other alternative fonts set by CSS.
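One way to picture the consent rule above is a server-side sketch in Python that emits the Google Fonts stylesheet link only when the visitor has agreed. The function name `font_head_tags`, the chosen font family, and the fallback CSS are assumptions for illustration; a real site would more likely wire this into its consent banner and template system.

```python
from html import escape

# Example Google Fonts stylesheet URL; the family is just an illustration.
GOOGLE_FONTS_CSS = (
    "https://fonts.googleapis.com/css2?family=Noto+Sans+TC&display=swap"
)


def font_head_tags(consented: bool) -> str:
    """Return the <head> snippet for fonts, depending on visitor consent."""
    if not consented:
        # No request to Google at all: rely on fonts already on the device.
        return "<style>body { font-family: sans-serif; }</style>"
    href = escape(GOOGLE_FONTS_CSS, quote=True)  # "&" must become "&amp;" in HTML
    return (
        f'<link rel="stylesheet" href="{href}">\n'
        "<style>body { font-family: 'Noto Sans TC', sans-serif; }</style>"
    )


print(font_head_tags(False))
print(font_head_tags(True))
```

Because the link tag is never emitted without consent, the visitor's browser never contacts the font server in that case, which is the behavior the judgment is concerned with.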

Later, I found a discussion thread where someone mentioned another web font service, Bunny Fonts. It is a free font hosting service provided by a foreign CDN that claims to be fully compatible with Google Fonts (they have the same fonts) and displays a big banner on its website stating that it does not collect data and is GDPR compliant.

However, after testing it out, I found that Bunny Fonts does not use Unicode-range to slice fonts, so even if a webpage only uses 3 Chinese characters with a single font weight, the user's browser will directly load the entire few MB font file.

bunny font vs google font load size
