Improve Data Discovery, Ensure Data Governance, and Enable Innovation
Ole Olesen-Bagneux

Combing the web is simple, but how do you search for data at work? It's difficult and time-consuming, and can sometimes seem impossible. This book introduces a practical solution: the data catalog. Data analysts, data scientists, and data engineers will learn how to create true data discovery in their organizations, making the catalog a key enabler for data-driven innovation and data governance.
Author Ole Olesen-Bagneux explains the benefits of implementing a data catalog. You'll learn how to organize data for your catalog, search for what you need, and manage data within the catalog. The book is written from both a data management perspective and a library and information science perspective.
Table of Contents
Part I. Organizing Data So You Can Search for It
Chapter 1. Introduction to Data Catalogs
Chapter 2. Organize Data: Design a Robust Architecture for Search
Chapter 3. Understand Search: Concepts, Features, and Mechanics
Chapter 4. Apply Search: From Simple to Advanced Patterns
Part II. Democratizing Data with a Data Catalog
Chapter 5. Discover Data: Empower End Users and Engage Stakeholders
Chapter 6. Access Data: The Keys to Successful Implementation
Chapter 7. Manage Data: Improve Lifecycle Management
Part III. Envisioning the Future of Data Catalogs
Chapter 8. Looking Ahead: The Company Search Engine and Improved Data Management
Who Should Read This Book
Although I had the epiphany about the potential of a data catalog and it was crystal clear in my head, I was then faced with the battle of explaining the features to the important stakeholders of my company. Although I knew they would see the benefits of this tool if they only took the time to understand it, they were simply not interested, nor did they have time to study it. I had to come up with a way to reach them.
I went back to the idea of using the data catalog as an enterprise search engine. So, I asked myself, “What are people searching for? What would a data scientist be searching for? A data protection officer? A chief information security officer?”
I decided to build demonstrations of the most vital data catalog features into small stories about specific stakeholders. Each slide deck had one central picture: a minimalist search bar with the company’s logo above it. I would explain the information need of a specific stakeholder, show the search in the search bar, reveal the result, then close with how the results could be used. In this way, I showed simple searches, complex searches, how to browse back and forth in the lineage of data, up and down in domains, and relationally in the graph that depicted our company. It had the same content as my previous demonstrations, but this time, it was explained from a stakeholder point of view: a specific person who was searching for something specific. And that worked.
The stakeholders not only got interested, but they also got excited. They now wanted the data catalog, because they understood that this tool was not just a collection of fancy features for data geeks. No, this tool was something way more fundamental: the data catalog could help them search and find the data they were looking for. I explained that, implemented with care, a data catalog has relevance for many of the employees in a company. This approach worked for me and my colleagues, and I hope that it will work for you and yours as well.
At the end of the day, we are all searching for something. And we search all the time. The only thing is, at work, it is very difficult to search for whatever we are trying to find. And we take that for granted, as something that we must just accept.
I’m assuming you’re reading this book because you’re planning to implement a data catalog, improve or sunset an existing one, or simply trying to understand what kind of technology a data catalog is: what it does, how it should be used, and whether it can help you in a certain way. You might be part of the offices of the legal counsel, chief data officer, data protection officer, or chief information officer. You might be a data engineer, data scientist, or data manager, or you might be part of the data governance team. If you are, then this book will help you understand what a data catalog is and how it will enable you to find exactly what you are searching for.
However, you may also be a data catalog provider. In my book, I put forward a vision for the future of data catalogs, which you could benefit from when planning the future development of your data catalog technology.
This book is a much-needed and refreshing addition to the data catalog landscape. Ole masterfully combines industry and practical experience with information and library science concepts to provide data catalog implementers with essential techniques for delivering a superior data discovery experience for their organization.
Juan Sequeda, PhD | Principal Scientist - data.world
For too long the information science and data management communities have been far apart. Dr. Olesen-Bagneux's groundbreaking work clearly demonstrates the vital necessity of bringing these communities together toward realizing the full potential of data assets. The fresh perspectives developed in his book show us the way forward for innovation both in practice and in the study of data systems that reflect the human context of data work.
George Fletcher, Professor in the Data and Artificial Intelligence Cluster at Eindhoven University of Technology
Ole Olesen-Bagneux's book helps enterprise organizations navigate the complex data landscape with higher precision than ever before. It highlights the tremendous opportunity we have to harvest the full value of investing in data through an enterprise data search engine. Ole provides the magical missing piece that enables data-driven organizations to reach their full potential. This book is a must-read for all IT professionals, data authorities, and data enthusiasts.
Ann Fogelgren, Chief Information Officer, GN Group
Ole Olesen-Bagneux holds a PhD in Information Science from the University of Copenhagen, Denmark, where he has also lectured in courses pivotal for data cataloging, such as Knowledge Organization and Information Retrieval, which teach how to organize data in big collections and retrieve it again. He has worked within the field of Data Management and Governance as a leader, architect, and practitioner for over a decade in the life science sector. He has hands-on experience with several data catalogs, and currently works as an Enterprise Architect at GN Store Nord, in Copenhagen, Denmark.









