
What Are General Search Engines?

by Legal PDF: Tech Court CasesAugust 9th, 2024


United States of America v. Google LLC, Court Filing, retrieved on April 30, 2024, is part of HackerNoon’s Legal PDF Series. This part is 6 of 37.

A. General Search Engines

57. General search engines (GSEs) answer a wide range of user queries by searching the worldwide web. Tr. 2167:1–4 (Giannandrea (Apple)) (A GSE is “a tool that you use to search the worldwide web using queries.”); Tr. 182:4–13 (Varian (Google)) (“[A] [GSE] handles a wide variety of queries in different areas and provides search results that are relevant to those queries” from the web.); Tr. 3670:6–18 (Ramaswamy (Neeva)) (A GSE answers for the “vast majority” of a consumer’s information needs.).


58. GSEs provide a “one-stop-shop,” which allows users to find relevant information for a broad set of needs. Tr. 3670:19–23 (Ramaswamy (Neeva)) (A GSE is a “little bit of a one-stop-shop for all information needs.”); Tr. 184:24–185:1 (Varian (Google)) (agreeing that users can find information for a broad set of information needs on Google and users go to GSEs because they provide “convenience to [users]”); Des. Tr. 249:12–15; 249:20–250:15 (van Der Kooi (Microsoft) Dep.) (Tail quality is important because a “consumer . . . is most loyal to a product where all their needs are being met.”); UPX0343 at -845 (Google’s model is to provide good answers for all queries.); Tr. 4610:13–22 (Whinston (Pls. Expert)) (GSEs allow consumers to “both find things that they don’t know about, and navigate easily to things that they do know about.”).


59. A specialized vertical provider (SVP) differs from a GSE in that the specialized engine answers a narrow set of queries. Tr. 2168:20–2169:11 (Giannandrea (Apple)) (Specialized search engines such as Amazon or kayak.com “[search] a very narrow set of content.”); Des. Tr. 331:25–334:15 (Connell (Microsoft) Dep.) (describing differences between GSEs and specialized search engines and noting that queries on specialized search engines are “narrower”).


60. After a user enters a query, a GSE will provide results on a search engine results page (SERP). Tr. 183:9–12 (Varian (Google)); Des. Tr. 31:21–32:2 (Fox (Google) Dep.). A GSE SERP is the “broad answer from a [GSE].” Tr. 2221:14–19 (Giannandrea (Apple)). As illustrated in Figure 1 below, excerpted from a Google document, a SERP generally includes, among other search features, organic (non-paid) search results and can include paid search results (ads). UPX0001 at -532–36; Des. Tr. 20:12–23 (Jain (Google) Dep.); Des. Tr. 13:19–14:2, 14:5–23 (Moxley (Google) Dep.). Organic results are based on the average user and what is popular on the internet. Tr. 1329:7–25 (Dischler (Google)).


Figure 1: A Search Engine Results Page


61. Organic search results often include what have traditionally been referred to as the “ten blue links.” Tr. 2221:14–25 (Giannandrea (Apple)) (A query entered into a GSE provides a SERP with 10 blue links.); Tr. 1970:18–1971:5 (Weinberg (DuckDuckGo)) (Queries typically return 10 organic or “blue” links.). The 10 blue links take the user to websites the GSE deems most relevant to the user’s query, based on the GSE’s algorithms. Tr. 1970:18–1971:7 (Weinberg (DuckDuckGo)) (Traditional links, “[o]therwise referred to as kind of the 10 blue links[,] . . . are the links that people think about when they think of search engines... And so I just mean like regular links to websites.”); UPX8104 at -165 (“With the vast amount of information available, finding what you need would be nearly impossible without some help sorting through it. Google’s ranking systems are designed to do just that: sort through hundreds of billions of webpages and other content in our Search index to present the most relevant, useful results in a fraction of a second.”).


62. In addition to the 10 blue links, GSEs offer search features or content in response to a query. UPX0266 at -983, -985–86 (describing “[s]earch features” as one of “the key parts o[f] a modern search product”); Tr. 8222:3–8223:9 (Reid (Google)) (explaining how Google used structured data to provide additional information in response to a user’s query). Search features come from structured data—data that is obtained from a source other than crawling and indexing the web. Tr. 8222:3–8223:9 (Reid (Google)) (explaining structured data and how it differs from information that Google collects from the web).


Structured data may include sports scores, weather information, business information, and hotel prices. Id. (identifying game scores, hotel prices, and business hours as structured data); Tr. 2307:13–17 (Giannandrea (Apple)) (a search feature “would be something like a one box” beyond “just ten blue links”); UPX0520 at -813 (identifying weather one-box); UPX0001 at -533 (showing the knowledge graph); UPX0870 at -.004 (describing how Google displays web results alongside media news, knowledge panel, and other features); Des. Tr. 21:8–19, 21:21–22:5 (Moxley (Google) Dep.) (describing the knowledge graph as an example of structured data). An example of a structured data search feature is the onebox. Tr. 2307:13–17 (Giannandrea (Apple)).


63. On SERPs that include paid search results, paid results will typically display before the organic search results. Tr. 6523:16–6524:11 (Hurst (Expedia)); Des. Tr. 31:6–31:18 (Jain (Google) Dep.) (“[I]f you type in a query . . . there is a Search box at the top. There are some ad results. There are some organic results. There is a bunch of stuff at the bottom.”). The top result on a SERP is generally the one the user is going to rely on the most. Tr. 2230:17–21 (Giannandrea (Apple)).


64. A higher position on the SERP may influence how many clicks an ad gets. Des. Tr. 315:22–316:15, 315:18–317:10 (Fox (Google) Dep.).


1. How General Search Engines Work


65. A GSE assembles a SERP by either building its own general search services functionality or syndicating results from a third party. UPX8052 at .004 (explaining how Google produces its search results); Tr. 2210:8–19; 2212:1–8, 2221:14–2223:6 (Giannandrea (Apple)) (explaining the fundamentals of a GSE); Des. Tr. 45:13–46:120 (Google-NF 30(b)(6) Dep.) (explaining how a query on a syndication partner like AOL “would generate a Results page on AOL” that was “powered by Google”).


66. Providing a modern GSE requires crawling the web, indexing the results, query understanding and refinement, retrieving information in response to a search query, ranking the web results, and whole page ranking to incorporate other search features. UPX0194 at -552 (“We first crawl the web to find information. Then we organize this information in the form of an index.”); id. at -556 (“Understanding the meaning of a query is crucial to returning good answers.”); id. at -563 (information from the index is retrieved and scored); id. at -566 (web and non-web results are assembled into a SERP that is served to a user in response to their query); UPX0870 at -104.003–04 (Google crawls the web for content, processes the raw web data, and uses the processed web data to create a web index, then annotates the query, retrieves and ranks webpages relevant to the query, and shows the results to the user on a SERP.); UPX0204 at -241 (depicting process for crawling and indexing, receiving and interpreting queries, retrieving and scoring documents, ranking adjustments, and serving results).


a) Crawling


67. In preparation for responding to queries, GSEs “crawl” the web and log the information available on websites. Tr. 2206:7–20 (Giannandrea (Apple)) (crawling is “step one” to building a GSE); Tr. 10274:3–10275:13 (Oard (Pls. Expert)) (describing how a GSE crawls the web to build an index); UPX0266 at -983 (crawling is one of “the key parts o[f] a modern search product”). “Crawling the web means visiting webpages to find new and updated content and creating a copy of that content.” UPX0870 at .003.


To build a comprehensive and fresh index, GSEs must constantly crawl the web so as to log new webpages (also called websites or documents) and update pages on existing sites. Id. at .005; UPX9002.A at -724. As of April 2020, Google crawled 20 billion sites every day. UPX0001 at -531.
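As an illustration only, and not drawn from the filing, the idea of visiting pages and keeping a copy of their content can be sketched in a few lines of Python. The in-memory `TOY_WEB` dictionary, the `.example` addresses, and the `crawl` function are all hypothetical stand-ins; a real crawler fetches pages over the network and revisits them continually to stay fresh.

```python
from collections import deque

# Hypothetical in-memory "web": address -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("home page about search", ["b.example", "c.example"]),
    "b.example": ("news about weather", ["a.example"]),
    "c.example": ("hotel prices and hours", ["d.example"]),
    "d.example": ("sports scores", []),
}

def crawl(seeds, web):
    """Visit pages breadth-first and keep a copy of each page's content."""
    copies = {}
    frontier = deque(seeds)
    while frontier:
        url = frontier.popleft()
        if url in copies or url not in web:
            continue  # already copied, or the page does not exist
        text, links = web[url]
        copies[url] = text      # "creating a copy of that content"
        frontier.extend(links)  # follow links to discover more pages
    return copies

pages = crawl(["a.example"], TOY_WEB)
print(sorted(pages))
```

Starting from a single seed page, the sketch discovers the other three pages by following links, which is the sense in which a crawl finds new content.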


b) Indexing


68. GSEs “index” or organize crawled information. Tr. 2210:12–19 (Giannandrea (Apple)) (indexing converts results from a web crawl into a serving index); Tr. 1774:17–25 (Lehman (Google)) (information on the web is arranged and categorized to use in Google’s search product); UPX0266 at -983 (key components of a modern search product include “crawl/index”); Tr. 10274:3–10275:13 (Oard (Pls. Expert)) (explaining the importance of an index to a GSE); Des. Tr. 64:5–6, 8–18, 20–22, 64:24–65:6 (Ramalingam (Yahoo) Dep.) (For Yahoo to provide its own search results without the Microsoft partnership, it would need to crawl and index the web.).


A traditional search index, “[m]uch like the index you’d find in the back of a book,” lists terms paired with the crawled webpages on which they appear. UPX0870 at .010. Because a search index is huge, webpages are stored in an efficient format (broken up into small pieces) that allows the webpages to be returned in response to a user query. Id. at .010–11; Tr. 2656:6–18 (Parakhin (Microsoft)). Creating an index is a “very expensive proposition.” Tr. 1941:17–1942:10 (Weinberg (DuckDuckGo)).
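The book-index analogy corresponds to what is commonly called an inverted index. The following is a minimal sketch under stated assumptions, not a description of any GSE's actual index: the sample `pages` and the `build_index` helper are hypothetical, and the sketch ignores the compression and sharding that a web-scale index would need.

```python
from collections import defaultdict

def build_index(pages):
    """Map each term to the set of pages on which it appears."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index

# Hypothetical crawled copies: address -> page text.
pages = {
    "b.example": "news about weather",
    "c.example": "hotel prices and hours",
    "e.example": "weather forecast and radar",
}

index = build_index(pages)
print(sorted(index["weather"]))
```

Looking up a term then costs one dictionary access instead of a scan over every stored page, which is why the index is built before any query arrives.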


c) Query Understanding And Refinement


69. Once a GSE has built an index, the GSE can search for information in response to a user’s query. UPX0870 at .013. A preliminary step in responding to a user’s query is query understanding and refinement. UPX0870 at .016–17. In this step, the search engine will take the “raw” query and parse it to better understand the user’s intent. Id.; UPX0266 at -984 (“Given a query, you want to find candidate docs quickly (the search part)[.] But first you need to understand if the query typed is the query that was intended[.] Spelling is the biggest problem here[.] Followed by synonyms . . . .”). Among the things that happen during query understanding and refinement are “[s]pelling correction” and “[s]ynonym expansion.” UPX0870 at .016. “For example, the query [evening commute nj transit] may get refined to include the intent (traffic conditions), time (6pm), and the carrier (New Jersey Transit).” Id. at .017.
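The two refinements named in the record, spelling correction and synonym expansion, can be illustrated with a short sketch. This is an assumption-laden toy and not how any GSE is known to do it: `VOCABULARY`, `SYNONYMS`, and `refine` are hypothetical, and the spelling step simply uses the standard library's `difflib.get_close_matches` for fuzzy string matching.

```python
import difflib

# Hypothetical vocabulary and synonym table for illustration.
VOCABULARY = ["weather", "forecast", "hotel", "prices", "tomorrow"]
SYNONYMS = {"weather": ["forecast"]}

def refine(raw_query):
    """Correct likely misspellings, then add synonyms of each term."""
    refined = []
    for token in raw_query.lower().split():
        if token not in VOCABULARY:
            # Replace an unknown token with its closest known term, if any.
            close = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=0.8)
            if close:
                token = close[0]
        refined.append(token)
        refined.extend(SYNONYMS.get(token, []))
    return refined

print(refine("wether tomorrow"))
```

Here the misspelled token is mapped to a known term before synonyms are added, mirroring the order in the quoted document: spelling first, "[f]ollowed by synonyms."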


d) Retrieval


70. The GSE will then use the query to retrieve or identify relevant webpages from the index. Tr. 1776:1–5 (Lehman (Google)); UPX8102 at -160 (Google Search instantly matches searches “sort[ing] through hundreds of billions of webpages and other information in [Google’s] Search index to find the most relevant, useful results.”).
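In its simplest form, retrieval is a lookup of each query term in the index followed by a merge of the matching pages. The sketch below assumes a tiny hypothetical `INDEX` and a `retrieve` helper that takes the union of matches; the record does not say how candidates are actually selected, and real systems use far more elaborate matching.

```python
# Hypothetical inverted index: term -> pages containing that term.
INDEX = {
    "weather": {"b.example", "e.example"},
    "forecast": {"e.example"},
    "hotel": {"c.example"},
}

def retrieve(query_terms, index):
    """Return every page that contains at least one query term."""
    candidates = set()
    for term in query_terms:
        candidates |= index.get(term, set())
    return candidates

print(sorted(retrieve(["weather", "forecast"], INDEX)))
```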


e) Ranking


71. After retrieval, the GSE will “rank” or sort retrieved websites according to how relevant they are to the query. Tr. 2268:17–2269:14 (Giannandrea (Apple)); UPX0266 at -984. Ranking retrieved webpages allows the GSE to determine which subset to display, and which get the higher placement. UPX0870 at .017–20.
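To make the sort-and-select step concrete, the sketch below ranks candidate pages by a deliberately crude score: the number of times the query terms occur in the page text. The scoring rule, the sample `docs`, and the `rank` function are assumptions for illustration; the record does not disclose a scoring formula, and production ranking relies on many more signals.

```python
def rank(query_terms, candidates, top_k=10):
    """Order candidate pages by a toy relevance score, best first."""
    def score(url):
        words = candidates[url].lower().split()
        return sum(words.count(term) for term in query_terms)

    # Sort by descending score; break ties by address for a stable order.
    ordered = sorted(candidates, key=lambda url: (-score(url), url))
    return ordered[:top_k]  # only the top subset is displayed

# Hypothetical retrieved pages: address -> page text.
docs = {
    "b.example": "weather news and more news",
    "e.example": "weather forecast weather radar",
    "f.example": "forecast for travel",
}

top = rank(["weather", "forecast"], docs, top_k=2)
print(top)
```

The `top_k` cut reflects the point made above: ranking decides both which subset is displayed and which results get the higher placement.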


f) Whole Page Ranking


72. Once all web results have been ranked, GSEs must determine which types of results they will show the user. UPX0869 at -866–67. The optimal SERP may require a combination of web and non-web results. Id. at -867; UPX0870 at .019–20. For example, a query like [star wars] could result in a SERP with “the Wikipedia page, a KnowledgePanel about the movie series, videos of the latest trailers, or even a Local block with opening hours for the Disney theme park if the user is near Anaheim.” UPX0869 at -867.
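One way to picture whole page ranking is as a merge of ranked web results with candidate feature blocks, each carrying its own relevance score. The sketch below is hypothetical throughout: the scores, the `min_score` threshold, the block names, and the `assemble_serp` function are invented for illustration and are not taken from the record.

```python
def assemble_serp(web_results, feature_blocks, min_score=0.5):
    """Combine web results and sufficiently relevant feature blocks."""
    page = [("web", url, score) for url, score in web_results]
    page += [
        (kind, title, score)
        for kind, title, score in feature_blocks
        if score >= min_score  # drop features judged not relevant enough
    ]
    page.sort(key=lambda block: -block[2])  # highest score placed first
    return [(kind, item) for kind, item, _ in page]

# Hypothetical candidates for the query [star wars].
web_results = [("wikipedia.example/star_wars", 0.9), ("fan-site.example", 0.4)]
feature_blocks = [
    ("knowledge_panel", "Star Wars (film series)", 0.8),
    ("videos", "Latest trailers", 0.6),
    ("local", "Theme park opening hours", 0.3),  # user assumed far from the park
]

serp = assemble_serp(web_results, feature_blocks)
print(serp)
```

In this toy run the local block is omitted because its assumed score falls below the threshold, loosely echoing the filing's example that such a block appears only "if the user is near Anaheim."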







This court case, retrieved on April 30, 2024, from storage.courtlistener, is part of the public domain. The court-created documents are works of the federal government and, under copyright law, are automatically placed in the public domain and may be shared without legal restriction.