What is a Database Index?
What are indexes?
Indexes are a powerful tool used in the background of a database to speed up querying. Indexes power queries by providing a method to quickly lookup the requested data.
Simply put, an index is a pointer to data in a table. An index in a database is very similar to an index in the back of a book.
Why are indexes needed?
Imagine walking into the Library of Congress and being given the task to find a specific publishing within 10 minutes. Would you be able to complete this task within the given time frame? The Library of Congress is considered the largest library in the world and it houses approximately 170 million items. Now, the Library of Congress is not a regular library where the public can check out books at will, but if you are like us, you know the challenge should not be too difficult. In fact, the first thing we would do is ask for access to the library’s index because indexes contain all the necessary information needed to access items quickly and efficiently.
In the same manner, a database index contains all the necessary information to access data quickly and efficiently. In today’s society, the business of data is rapidly advancing. In fact, some tech giants process several hundred petabytes (1000⁵ bytes) of data per day. Storing all of this data in a database is great, but for a data company, being able to access that data efficiently is paramount to success. Just like the Library of Congress example, one way of solving the access issue when it comes to large amounts of data is through the use of indexes. Indexes serve as lookup tables that efficiently store data for quicker retrieval.
How are indexes created?
In a database, data is stored in rows which are organized into tables. Each row has a unique key which distinguishes it from all other rows and those keys are stored in an index for quick retrieval.
Since keys are stored in indexes, each time a new row with a unique key is added, the index is automatically updated. However, sometimes we need to be able to quickly lookup data that is not stored as a key. For example, we may need to quickly lookup customers by telephone number. It would not be a good idea to use a unique constraint because we can have multiple customers with the same phone number. In these cases, we can create our own indexes.
The syntax for creating an index will vary depending on the database. However, the syntax typically includes a
CREATE keyword followed by the
INDEX keyword and the name we’d like to use for the index. Next should come the
ON keyword followed by the name of the table that has the data we’d like to quickly access. Finally, the last part of the statement should be the name(s) of the columns to be indexed.
CREATE INDEX <index_name>ON <table_name> (column1, column2, ...)
For example, if we would like to index phone numbers from a
customers table, we could use the following statement:
CREATE INDEX customers_by_phoneON customers (phone_number)
The users cannot see the indexes, they are just used to speed up searches/queries.
Note: Updating a table with indexes takes more time than updating a table without (because the indexes also need an update). So, only create indexes on columns that will be frequently searched against.
This article was designed to give you an overview of database indexes, including how:
- Indexes are a powerful tool used in the background of a database to speed up querying.
- Indexes contain all the necessary information needed to access items quickly and efficiently.
- Indexes serve as lookup tables to efficiently store data for quicker retrieval.
- Table keys are stored in indexes.
- Indexes for non-key values can be created with a
*We will not be using indexes in this course, but it’s good to know about them. 👩🏫