MEAP VERSION 15
Welcome
Thanks for purchasing the MEAP for AI-Powered Search!
This book teaches you the knowledge and skills you need to build highly intelligent search applications that automatically learn from every content update and user interaction, delivering continuously more relevant search results.
As you can imagine given that goal, this is not an introductory book on search. To get the most out of this book, you should ideally already be familiar with the core capabilities of modern search engines (inverted indices, relevance ranking, faceting, query parsing, text analysis, and so on) through experience with a technology like Apache Solr, Elasticsearch/OpenSearch, Vespa, or Apache Lucene. If you need to come up to speed quickly, Solr in Action (which I also wrote) provides you with all the search background necessary to dive head-first into AI-Powered Search.
Additionally, the code examples in this book are written in Python (and delivered in pre-configured Jupyter notebooks) to appeal both to engineers and data scientists. You don't need to be an expert in Python, but you should have some programming experience to be able to read and understand the examples.
Over my career, I've had the opportunity to dive deep into search relevance, semantic search, recommendations, behavioral signals processing, learning to rank, dense vector search, and many other AI-powered search capabilities, publishing cutting-edge research in top journals and conferences and, more importantly, delivering working software at massive scale. As Founder of Searchkernel and as Lucidworks' former Chief Algorithms Officer and SVP of Engineering, I've also helped deliver many of these capabilities to hundreds of the most innovative companies in the world to help them power search experiences you probably use every single day.
I'm thrilled to also have Doug Turnbull (Shopify) and Max Irwin (OpenSource Connections) as contributing authors on this book, pulling from their many years of hands-on experience helping companies and clients with search and relevance engineering. Doug is contributing chapters 10-12 about building machine-learned ranking models (Learning to Rank) and automating their training using click models, and Max is contributing chapters 13-14 on dense vector search, question answering, and the search frontier.
In this book, we distill our decades of combined experience into a practical guide to help you take your search applications to the next level. You'll discover how to enable your applications to continually learn to better understand your content, users, and domain in order to deliver optimally relevant experiences with each and every user interaction. We're working steadily on the book, and readers should expect a new chapter to arrive about every 1 to 2 months.
By purchasing the MEAP of AI-Powered Search, you gain early access to written chapters, as well as the ability to provide input into what goes into the book as it is being written. If you have any comments or questions along the way, please direct them to Manning's discussion forum for the book.
I would greatly appreciate your feedback and suggestions, as they will be invaluable toward making this book all it can be. Thanks again for purchasing the MEAP, thank you in advance for your input, and best wishes as you begin putting AI-Powered Search into practice!
Trey Grainger
1 Introducing AI-powered search
This chapter covers
- The need for AI-powered search
- The dimensions of user intent
- Foundational technologies for building AI-powered search
- How AI-powered search works
The search box has rapidly become the default user interface for interacting with data in most modern applications. If you think of every major app or website you visit on a daily basis, one of the first things you likely do on each visit is type or speak a query in order to find the content or actions most relevant to you in that moment.
Even in scenarios where you are not explicitly searching, you may instead be consuming streams of content customized for your particular tastes and interests. Whether these be video recommendations, items for purchase, e-mails sorted by priority or recency, news articles, or other content, you are likely still looking at filtered or ranked results and given the option to either page through or explicitly filter the content with your own query.
Whereas the phrase "search engine" brings to mind, for most people, a website like Google, Baidu, or Bing that enables queries based upon a crawl of the entire public internet, the reality is that search is now ubiquitous - it is a tool present and available in nearly all of our digital interactions every day across the numerous websites and applications we use.
Furthermore, while not too long ago the expected response from a search box may have been simply returning "ten blue links" - a list of ranked documents for a user to investigate to find further information in response to their query - expectations for the intelligence level of search technologies have skyrocketed in recent years.
Today's search capabilities are expected to be:
- Domain-aware: understanding the entities, terminology, categories, and attributes of each specific use case and corpus of documents, not just leveraging generic statistics on strings of text.
- Contextual & Personalized: able to use user context (location, last search, profile, previous interactions, user recommendations, and user classification), query context (other keywords, similar searches), and domain context (inventory, business rules, domain-specific terminology) in order to better understand user intent.
- Conversational: able to interact in natural language and guide users through a multi-step discovery process while learning and remembering relevant new information along the way.
- Multi-modal: able to resolve text queries, voice queries, and searches using images or video, or even to monitor for events and send event-based push notifications.
- Intelligent: able to deliver predictive type-ahead and to understand what users mean (spelling correction, phrase and attribute detection, intent classification, conceptual searching, and so on) in order to deliver the right answers at the right time, while constantly getting smarter.
- Assistive: moving beyond just delivery of links to delivering answers and available actions.
The goal of AI-powered search is to leverage automated learning techniques to deliver on these desired capabilities. While many organizations start with basic text search and spend many years trying to manually optimize synonym lists, business rules, ontologies, field weights, and countless other aspects of their search configuration, some are beginning to realize that most of this process can actually be automated.
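To make the idea of automation concrete, here is a minimal sketch of how candidate entries for a synonym list might be proposed from user behavior rather than written by hand. It is only an illustration under stated assumptions: the query log, the function name mine_related_queries, and the min_sessions threshold are all hypothetical, and simple co-occurrence counting within a session is a stand-in for the more robust techniques a real system would use.

```python
from collections import Counter
from itertools import combinations

# Hypothetical query log of (session_id, query) pairs, invented for illustration.
query_log = [
    ("s1", "laptop"), ("s1", "notebook"),
    ("s2", "laptop"), ("s2", "notebook"),
    ("s3", "laptop"), ("s3", "notebook"), ("s3", "charger"),
    ("s4", "tv"), ("s4", "television"),
    ("s5", "tv"), ("s5", "television"),
]

def mine_related_queries(log, min_sessions=2):
    """Return query pairs that appear together in at least min_sessions sessions."""
    # Group the distinct queries issued within each session.
    sessions = {}
    for session_id, query in log:
        sessions.setdefault(session_id, set()).add(query)

    # Count how many sessions contain each unordered pair of queries.
    pair_counts = Counter()
    for queries in sessions.values():
        for pair in combinations(sorted(queries), 2):
            pair_counts[pair] += 1

    # Keep only pairs seen often enough to be worth a human review.
    return {pair: n for pair, n in pair_counts.items() if n >= min_sessions}

candidates = mine_related_queries(query_log)
for (left, right), count in sorted(candidates.items()):
    print(f"{left} <-> {right} (seen together in {count} sessions)")
```

On this toy log, the pairs that survive the threshold are laptop/notebook and television/tv, while the one-off pairing with "charger" is filtered out. Pairs found this way are only candidates: queries that co-occur in a session may be related without being synonyms, so a threshold and some form of review or further modeling would still be needed.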
This book is an example-driven guide through the most applicable machine learning algorithms and techniques commonly leveraged to build intelligent search systems. We'll not only walk through key concepts, but also provide reusable code examples covering data collection and processing techniques, as well as the self-learning query interpretation and relevance strategies employed to deliver AI-powered search capabilities across today's leading organizations - hopefully soon to include your own!