Computer Science

Hash Maps

Hash maps are data structures that store key-value pairs. They use a hash function to map keys to indices in an array, allowing for constant-time access to values. Hash maps are commonly used in programming for efficient data retrieval and storage.

Written by Perlego with AI-assistance

Related key terms

1 of 5

10 Key excerpts on "Hash Maps"

No longer available |Learn more
Data Structures and Program Design Using C
A Self-Teaching Introduction
- D. Malhotra, N. Malhotra(Authors)
- 2018(Publication Date)
- Mercury Learning and Information
  (Publisher)
Answer. Hashing is the process of mapping keys to their appropriate locations in the hash table. It is the most effective technique of searching the values in an array or in a hash table. 430 • DATA STRUCTURES AND PROGRAM DESIGN USING C 10.1.2 Hash Tables A hash table is a data structure which supports one of the efficient searching techniques, that is, hashing. A hash table is an array in which the data is accessed through a special index called a key. In a hash table, keys are mapped to the array positions by a hash function. A hash function is a function, or we can say that it is a mathematical formula, which when applied to a key, produces an integer which is used as an index to find a key in the hash table. Thus, a value stored in a hash table can be searched in O(1) time with the help of a hash function. The main idea behind a hash table is to establish a direct mapping between the keys and the indices of the array. FIGURE 10.3. Mapping of keys using a direct addressing method. HASHING • 431 10.1.3 Hash Functions A hash function is a mathematical formula which when applied to a key, produces an integer which is used as an index to find a key in the hash table. Characteristics of the Hash Function There are four main characteristics of hash functions which are: 1. The hash function uses all the input data. 2. The hash function must generate different hash values. 3. The hash value is fully determined by the data being hashed. 4. The hash function must distribute the keys uniformly across the entire hash table. FIGURE 10.4. Mapping of keys to the hash table using hashing. 432 • DATA STRUCTURES AND PROGRAM DESIGN USING C Different Types of Hash Functions In this section, we will discuss some of the common hash functions: 1. Division Method – In the division method, a key k is mapped into one of the m slots by taking the remainder of k divided by m.
Sign up to read
Learn more about book
eBook - ePub
Hands-On Data Structures and Algorithms with Python
- Dr. Basant Agarwal(Author)
- 2022(Publication Date)
- Packt Publishing
  (Publisher)
8 Hash Tables A hash table is a data structure that implements an associative array in which the data is stored by mapping the keys to the values as key-value pairs. In many applications, we mostly require different operations such as insert, search, and delete in a dictionary data structure. For example, a symbol table is a data structure based on a hash table that is used by the compiler. A compiler that translates a programming language maintains a symbol table in which keys are character strings that are mapped to the identifiers. In such situations, a hash table is an effective data structure since we can directly compute the index of the required record by applying a hash function to the key. So, instead of using the key as an array index directly, the array index is computed by applying the hash function to the key. It makes it very fast to access an element from any index from the hash table. The hash table uses the hashing function to compute the index of where the data item should be stored in the hash table. While looking up an element in the hash table, hashing of the key gives the index of the corresponding record in the table. Ideally, the hash function assigns a unique value to each of the keys; however, in practice, we may get hash collisions where the hash function generates the same index for more than one key. In this chapter, we will be discussing different techniques that deal with such collisions. In this chapter, we will discuss all the concepts related to these, including: Hashing methods and hash table techniques Different collision resolution techniques in hash tables Introducing hash tables As we know, arrays and lists store the data elements in sequence. As in an array, the data items are accessed by an index number. Accessing array elements using index numbers is fast. However, they are very inconvenient to use when it is required to access any element when we can’t remember the index number
Sign up to read
Learn more about book
eBook - PDF
A Textbook of Data Structures and Algorithms, Volume 3
Mastering Advanced Data Structures and Algorithm Design Strategies
- G. A. Vijayalakshmi Pai(Author)
- 2022(Publication Date)
- Wiley-ISTE
  (Publisher)
Random access is one in which the data elements of the dictionary are accessed according to no particular order. 2 A Textbook of Data Structures and Algorithms 3 Hash tables are ideal data structures for dictionaries. In this chapter, we introduce the concept of hashing and hash functions. The structure and operations of the hash tables are also discussed. The various methods of collision resolution, for example, linear open addressing and chaining and their performance analyses are detailed. Finally, the application of hash tables in the fields of compiler design, relational database query processing and file organization are discussed. 13.2. Hash table structure A hash function H(X) is a mathematical function which, when given a key X of the dictionary D maps it to a position P in a storage table termed hash table. The process of mapping the keys to their respective positions in the hash table is called hashing. Figure 13.1 illustrates a hash function. Figure 13.1. Hashing a key When the data elements of the dictionary are to be stored in the hash table, each key X i is mapped to a position P i in the hash table as determined by the value of H(X i ), that is, P i = H(X i ). To search for a key X in the hash table all that one does is determine the position P by computing P = H(X) and accessing the appropriate data element. In the case of insertion of a key X or its deletion, the position P in the hash table where the data element needs to be inserted or from where it is to be deleted respectively, is determined by computing P = H(X). If the hash table is implemented using a sequential data structure, for example, arrays, then the hash function H(X) may be so chosen to yield a value that corresponds to the index of the array. In such a case the hash function is a mere mapping of the keys to the array indices. Hash Tables 3 EXAMPLE 13.1.– Consider a set of distinct keys { AB12, VP99, RK32, CG45, KL78, OW31, ST65, EX44 } to be represented as a hash table.
Sign up to read
Learn more about book
eBook - ePub
Learning JavaScript Data Structures and Algorithms
- Loiane Groner(Author)
- 2018(Publication Date)
- Packt Publishing
  (Publisher)
Dictionaries and Hashes

In the previous chapter, we learned about sets. In this chapter, we will continue our discussion about data structures that store unique values (non-repeated values) using dictionaries and hashes.
In a set, we are interested in the value itself as the primary element. In a dictionary (or map), we store values in pairs as [key, value]. The same goes for hashes (they store values in pairs, such as [key, value]); however, the way that we implement these data structures is a little bit different as dictionaries can only store a single value per key, as we will learn in this chapter.
In this chapter, we will cover:

The dictionary data structure

The hash table data structure

Handling collisions in hash tables

The ECMAScript 2015 Map, WeakMap, and WeakSet classes

Passage contains an image

The dictionary data structure

As we have already learned, a set is a collection of distinct elements (non-repeated elements). A dictionary is used to store [key, value] pairs, where the key could be used to find a particular element. A dictionary is very similar to a set; a set stores a [key, key] collection of elements, and a dictionary stores a [key, value] collection of elements. A dictionary is also known as a map , symbol table , and an associative array .

In computer science, dictionaries are often used to store the reference address of objects. For example, if we open the Chrome | Developer tools
in the Memory tab and run a snapshot , we will be able to see some objects and their respective address references in the memory (represented by
@<number>). We can see this scenario in the following screenshot:
Sign up to read
Learn more about book
eBook - ePub
F# for Scientists
- Jon Harrop(Author)
- 2011(Publication Date)
- Wiley-Interscience
  (Publisher)
fold ) is effectively random. In fact, the order is related to the hash function.

Hash tables can clearly be useful in the context of scientific programming. However, a functional alternative to these imperative hash tables can sometimes be desirable. We shall now examine a functional data structure which uses different techniques to implement the same functionality of mapping keys to corresponding values.
3.6 MAPS Maps have the following properties:
- Immutable.
- Each key maps to a single value.
- Fast insertion, deletion and search.
We described the functional implementation of the set data structure provided by F# in section 3.4. The core F# library provides a similar data structure known simply as a map13 .

Much like a hash table, the map data structure associates keys with corresponding values. Consequently, the map data structure also provides add and remove functions to insert and delete mappings, respectively, and a find function which returns the value corresponding to a given key.

Unlike hash tables, maps are represented internally by a balanced binary tree, rather than an array, and maps differentiate between keys using a specified total ordering function, rather than a hash function.
Due to their design differences, maps have the following advantages over hash tables:
- Immutable: operations on maps derive new maps without invalidating old maps.
- Stable O (log n ) time-complexity for inserting and removing a mapping, compared to unstable, amortized Θ(1) time-complexity in the case of hash tables (which may take up to O (n ) for some insertions, as illustrated in figure 3.9 ).
- Maps are based upon comparison and hash tables are based upon hashing. Comparisons are easier to define and more readily available.
- If comparison is faster than hashing (e.g. for structurally large keys, where comparison can return early) then a map may be faster than a hash table.
Sign up to read
Learn more about book
eBook - ePub
Essential Algorithms
A Practical Approach to Computer Algorithms Using Python and C#
- Rod Stephens(Author)
- 2019(Publication Date)
- Wiley
  (Publisher)
dictionaries.

The process of mapping a key value for use by the hash table is called hashing. Good hashing functions spread out key values so that they don't all go to the same position in the table. In particular, key values are often similar, so a good hashing function maps similar key values to dissimilar locations in the table.

For example, suppose that you want to store customer records in a hash table and look them up by name. If two customers have the last names Richards and Richardson, ideally the hashing function should map them to two different locations.
To achieve this, hashing functions often generate a value that looks something like gibberish, as if the key value had been chopped into hash.
If you put enough values in a hash table, eventually you'll find two keys that hash to the same value. That's called a collision. When that occurs, you need a collision-resolution policy that determines what to do. Often the collision resolution policy maps the key to a series of new positions in the table until it finds an empty position.

A hash table's fill percentage, the percentage of the table that contains entries, influences the chance of collisions occurring. Adding a new key to a hash table is more likely to cause a collision if the table's data structure is 95 percent full than if it's only 10 percent full.
To summarize, a hash table needs the following:
- A data structure to hold the data
- A hashing function to map keys to locations in the data structure
- A collision-resolution policy that specifies what to do when keys collide
To be useful, a hash table must be able to at least add new items and locate items that were previously stored. Another feature that is useful but not provided by some hash tables is the ability to remove a hashed key.
Sign up to read
Learn more about book
eBook - PDF
Advanced Data Structures
- Peter Brass(Author)
- 2008(Publication Date)
- Cambridge University Press
  (Publisher)
9 Hash Tables Hash tables are a dictionary structure of great practical importance and can be very efficient. The underlying idea is quite simple: we have a universe U and want to store a set of objects with keys from U . We also have s buckets and a function h from U to S = {0, . . . , s − 1}. Then we store the object with key u in the h(u)th bucket. If several objects that we want to store are mapped to the same bucket, we have a collision between these objects. If there are no collisions, then we can realize the buckets just as an array, each array entry having space for one object. The theory of hash tables mainly deals with the questions of what to do about the collisions and how to choose the function h in such a way that the number of collisions is small. The idea of hash tables is quite old, apparently starting in several groups at IBM in 1953 (Knott 1972). For a long time the main reason for the popularity of hash tables was the simple implementation; the hash functions h were chosen ad hoc as some unintelligible way to map the large universe to the small array allocated for the table. It was the practical programmer’s dictionary structure of choice, easily written and conceptually understood, with no performance guarantees, and it still exists in this style in many texts aimed at that group. The development and analysis of hash table methods that are provably good in some sense started only in the 1980s, and now a well-designed hash table can indeed be a very efficient structure. 9.1 Basic Hash Tables and Collision Resolution If we map the keys of a big universe U to a small set S = {0, . . . , s − 1}, then it is unavoidable that many universe elements are mapped to the same element of S . In a dictionary structure, we do not have to store the entire universe, but only some set X ⊂ U of n keys for the objects currently in the dictionary. But if we 374
Sign up to read
Learn more about book
eBook - PDF
Data Structures and Algorithms in Python
- Michael T. Goodrich, Roberto Tamassia, Michael H. Goldwasser(Authors)
- 2013(Publication Date)
- Wiley
  (Publisher)
In practice, hash tables are among the most efficient means for implementing a map, and it is essentially taken for granted by programmers that their core oper- ations run in constant time. Python’s dict class is implemented with hashing, and the Python interpreter relies on dictionaries to retrieve an object that is referenced by an identifier in a given namespace. (See Sections 1.10 and 2.5.) The basic com- mand c = a + b involves two calls to getitem in the dictionary for the local namespace to retrieve the values identified as a and b, and a call to setitem to store the result associated with name c in that namespace. In our own algorithm analysis, we simply presume that such dictionary operations run in constant time, independent of the number of entries in the namespace. (Admittedly, the number of entries in a typical namespace can almost surely be bounded by a constant.) In a 2003 academic paper [31], researchers discuss the possibility of exploiting a hash table’s worst-case performance to cause a denial-of-service (DoS) attack of Internet technologies. For many published algorithms that compute hash codes, they note that an attacker could precompute a very large number of moderate-length strings that all hash to the identical 32-bit hash code. (Recall that by any of the hashing schemes we describe, other than double hashing, if two keys are mapped to the same hash code, they will be inseparable in the collision resolution.) In late 2011, another team of researchers demonstrated an implementation of just such an attack [61]. Web servers allow a series of key-value parameters to be embedded in a URL using a syntax such as ?key1=val1&key2=val2&key3=val3. Typically, those key-value pairs are immediately stored in a map by the server, and a limit is placed on the length and number of such parameters presuming that storage time in the map will be linear in the number of entries.
Sign up to read
Learn more about book
eBook - PDF
Data Structures and Algorithms in Java
- Michael T. Goodrich, Roberto Tamassia, Michael H. Goldwasser(Authors)
- 2014(Publication Date)
- Wiley
  (Publisher)
Ideally, keys will be well distributed in the range from 0 to N − 1 by a hash function, but in practice there may be two or more distinct keys that get mapped to the same index. As a result, we will conceptualize our table as a bucket array, as shown in Figure 10.4, in which each bucket may manage a collection of entries that are sent to a specific index by the hash function. (To save space, an empty bucket may be replaced by a null reference.) 0 1 2 3 4 5 6 7 8 9 10 (1,D) (25,C) (3,F) (14,Z) (39,C) (6,A) (7,Q) Figure 10.4: A bucket array of capacity 11 with entries (1,D), (25,C), (3,F), (14,Z), (6,A), (39,C), and (7,Q), using a simple hash function. 10.2. Hash Tables 411 10.2.1 Hash Functions The goal of a hash function, h, is to map each key k to an integer in the range [0, N − 1], where N is the capacity of the bucket array for a hash table. Equipped with such a hash function, h, the main idea of this approach is to use the hash function value, h(k), as an index into our bucket array, A, instead of the key k (which may not be appropriate for direct use as an index). That is, we store the entry (k, v) in the bucket A[h(k)]. If there are two or more keys with the same hash value, then two different entries will be mapped to the same bucket in A. In this case, we say that a collision has occurred. To be sure, there are ways of dealing with collisions, which we will discuss later, but the best strategy is to try to avoid them in the first place. We say that a hash function is “good” if it maps the keys in our map so as to sufficiently minimize collisions. For practical reasons, we also would like a hash function to be fast and easy to compute. It is common to view the evaluation of a hash function, h(k), as consisting of two portions—a hash code that maps a key k to an integer, and a compression function that maps the hash code to an integer within a range of indices, [0, N − 1], for a bucket array.
Sign up to read
Learn more about book
eBook - PDF
Objects, Abstraction, Data Structures and Design
Using C++
- Elliot B. Koffman, Paul A. T. Wolfgang(Authors)
- 2012(Publication Date)
- Wiley
  (Publisher)
Suppose the fragment in Self-Check Exercise 2 were modified to create a mul- timap, b_map, instead. Show that multimap. What would be displayed if you accessed the entries using an iterator and displayed the value of the first and second data fields of each entry on a line separated by a colon? All values for a given key should appear on the same line separated by commas. P R O G R A M M I N G 1. Write statements to create a map object that stores each word occurring in a term paper along with the number of times the word occurs. 2. Write statements to perform the display described in Self-Check Exercise 2. 3. Write statements to perform the display described in Self-Check Exercise 3. 9.3 Hash Tables The C++ standard library uses a special type of binary search tree, called a balanced binary search tree, to implement the set and map classes. This provides access to items in O(log n) time. Sets and maps can also be implemented using a data struc- ture known as a hash table, which has some advantages over balanced search trees. The goal behind the hash table is to be able to access an entry based on its key value, not its location. In other words, we want to be able to access an element directly through its key value (associative retrieval), rather than having to determine its loca- tion first by searching for the key value in an array or a tree. Using a hash table enables us to retrieve an item in constant time (expected O(1)). We say expected O(1) rather than just O(1), because there will be some cases where the performance 530 Chapter 9 Sets and Maps will be much worse than O(1) and may even be O(n), but on the average we expect that it will be O(1). Hash Codes and Index Calculation The basis of hashing (and hash tables) is to transform the item’s key value to an inte- ger value (its hash code) which will then be transformed into a table index. Figure 9.3 illustrates this process for a table of size n. We discuss how this might be done in the next few examples.
Sign up to read
Learn more about book

Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.

Explore more topic indexes

1 of 8

View all

Hash Maps

Related key terms

10 Key excerpts on "Hash Maps"

Data Structures and Program Design Using C

A Self-Teaching Introduction

Hands-On Data Structures and Algorithms with Python

A Textbook of Data Structures and Algorithms, Volume 3

Mastering Advanced Data Structures and Algorithm Design Strategies

Learning JavaScript Data Structures and Algorithms

Dictionaries and Hashes

The dictionary data structure

F# for Scientists

Essential Algorithms

A Practical Approach to Computer Algorithms Using Python and C#

Advanced Data Structures

Data Structures and Algorithms in Python

Data Structures and Algorithms in Java

Objects, Abstraction, Data Structures and Design

Using C++

Explore more topic indexes