Computer Science

Hash Structure

A hash structure is a data structure that uses a hash function to map each key to the position where its value is stored. This allows efficient storage and retrieval, with constant-time access to elements on average, provided that collisions between keys are kept rare. Hash structures are commonly used in programming languages and databases.

Written by Perlego with AI-assistance

4 Key excerpts on "Hash Structure"

  • Data Structures and Program Design Using C

    A Self-Teaching Introduction

    Hashing is the process of mapping keys to their appropriate locations in the hash table. It is one of the most effective techniques for searching for values in an array or in a hash table.

    10.1.2 Hash Tables

    A hash table is a data structure that supports one of the most efficient searching techniques, namely hashing. A hash table is an array in which the data is accessed through a special index called a key. In a hash table, keys are mapped to array positions by a hash function. A hash function is a mathematical formula which, when applied to a key, produces an integer that is used as an index to find the key in the hash table. Thus, a value stored in a hash table can be found in O(1) time on average with the help of a hash function. The main idea behind a hash table is to establish a direct mapping between the keys and the indices of the array.

    FIGURE 10.3. Mapping of keys using a direct addressing method.

    10.1.3 Hash Functions

    A hash function is a mathematical formula which, when applied to a key, produces an integer that is used as an index to find the key in the hash table.

    Characteristics of the Hash Function

    There are four main characteristics of hash functions:

    1. The hash function uses all the input data.
    2. The hash function should, as far as possible, generate different hash values for different keys.
    3. The hash value is fully determined by the data being hashed.
    4. The hash function must distribute the keys uniformly across the entire hash table.

    FIGURE 10.4. Mapping of keys to the hash table using hashing.

    Different Types of Hash Functions

    In this section, we will discuss some of the common hash functions:

    1. Division Method – In the division method, a key k is mapped into one of the m slots by taking the remainder of k divided by m.
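The division method described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the function name `division_hash`, the table size `m = 7`, and the sample keys are assumptions chosen so that no two keys land in the same slot.

```python
def division_hash(key: int, m: int) -> int:
    """Division method: map an integer key to one of m slots (0 .. m-1)."""
    if m <= 0:
        raise ValueError("table size m must be positive")
    return key % m


# Hypothetical table size and keys, picked so that no collisions occur.
m = 7
table = [None] * m
for key in (15, 23, 40):
    # The remainder of key divided by m is used directly as the array index.
    table[division_hash(key, m)] = key
```

With keys that do collide, a slot would simply be overwritten here, so a practical table also needs a collision resolution strategy.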
  • A Textbook of Data Structures and Algorithms, Volume 3
    eBook - PDF

    Mastering Advanced Data Structures and Algorithm Design Strategies

    • G. A. Vijayalakshmi Pai (Author)
    • 2022 (Publication Date)
    • Wiley-ISTE (Publisher)
    Random access is one in which the data elements of the dictionary are accessed according to no particular order. Hash tables are ideal data structures for dictionaries. In this chapter, we introduce the concept of hashing and hash functions. The structure and operations of hash tables are also discussed. The various methods of collision resolution, for example, linear open addressing and chaining, and their performance analyses are detailed. Finally, the applications of hash tables in the fields of compiler design, relational database query processing and file organization are discussed.

    13.2. Hash table structure

    A hash function H(X) is a mathematical function which, when given a key X of the dictionary D, maps it to a position P in a storage table termed a hash table. The process of mapping the keys to their respective positions in the hash table is called hashing. Figure 13.1 illustrates a hash function.

    Figure 13.1. Hashing a key

    When the data elements of the dictionary are to be stored in the hash table, each key X_i is mapped to a position P_i in the hash table as determined by the value of H(X_i), that is, P_i = H(X_i). To search for a key X in the hash table, all that one does is determine the position P by computing P = H(X) and access the appropriate data element. In the case of insertion of a key X or its deletion, the position P in the hash table where the data element needs to be inserted, or from where it is to be deleted, is likewise determined by computing P = H(X). If the hash table is implemented using a sequential data structure, for example, an array, then the hash function H(X) may be chosen so as to yield a value that corresponds to an index of the array. In such a case the hash function is a mere mapping of the keys to the array indices.

    EXAMPLE 13.1.– Consider a set of distinct keys { AB12, VP99, RK32, CG45, KL78, OW31, ST65, EX44 } to be represented as a hash table.
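The search, insertion and deletion steps, each starting from P = H(X), might be sketched as follows for an array-backed table with linear open addressing. This is an illustrative sketch only: the class name `LinearProbingTable`, the table size 11, and in particular the hash function (sum of character codes modulo the table size) are assumptions and are not the function used in the book's example. Only the eight keys are taken from the text.

```python
class LinearProbingTable:
    """Array-backed hash table with linear open addressing (illustrative)."""

    _EMPTY = object()    # slot that has never held a key
    _DELETED = object()  # tombstone left behind by a deletion

    def __init__(self, size: int = 11):
        self.size = size
        self.slots = [self._EMPTY] * size

    def _hash(self, key: str) -> int:
        # Assumed hash function H(X): sum of character codes modulo table size.
        return sum(ord(c) for c in key) % self.size

    def _probe(self, key: str):
        # Visit position P = H(X) first, then the following slots, wrapping around.
        start = self._hash(key)
        for step in range(self.size):
            yield (start + step) % self.size

    def search(self, key: str) -> bool:
        for pos in self._probe(key):
            slot = self.slots[pos]
            if slot is self._EMPTY:
                return False  # an empty slot ends the probe sequence
            if slot is not self._DELETED and slot == key:
                return True
        return False

    def insert(self, key: str) -> bool:
        if self.search(key):
            return True  # key already present
        for pos in self._probe(key):
            if self.slots[pos] is self._EMPTY or self.slots[pos] is self._DELETED:
                self.slots[pos] = key
                return True
        return False  # table is full

    def delete(self, key: str) -> bool:
        for pos in self._probe(key):
            slot = self.slots[pos]
            if slot is self._EMPTY:
                return False
            if slot is not self._DELETED and slot == key:
                self.slots[pos] = self._DELETED  # keep later keys reachable
                return True
        return False


keys = ["AB12", "VP99", "RK32", "CG45", "KL78", "OW31", "ST65", "EX44"]
table = LinearProbingTable(size=11)
for key in keys:
    table.insert(key)
```

Deletion marks the slot with a tombstone instead of emptying it, so that keys stored further along the same probe sequence can still be found.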
  • Advanced Data Structures
    9 Hash Tables

    Hash tables are a dictionary structure of great practical importance and can be very efficient. The underlying idea is quite simple: we have a universe U and want to store a set of objects with keys from U. We also have s buckets and a function h from U to S = {0, ..., s − 1}. Then we store the object with key u in the h(u)th bucket. If several objects that we want to store are mapped to the same bucket, we have a collision between these objects. If there are no collisions, then we can realize the buckets just as an array, each array entry having space for one object. The theory of hash tables mainly deals with the questions of what to do about the collisions and how to choose the function h in such a way that the number of collisions is small.

    The idea of hash tables is quite old, apparently starting in several groups at IBM in 1953 (Knott 1972). For a long time the main reason for the popularity of hash tables was the simple implementation; the hash functions h were chosen ad hoc as some unintelligible way to map the large universe to the small array allocated for the table. It was the practical programmer’s dictionary structure of choice, easily written and conceptually understood, with no performance guarantees, and it still exists in this style in many texts aimed at that group. The development and analysis of hash table methods that are provably good in some sense started only in the 1980s, and now a well-designed hash table can indeed be a very efficient structure.

    9.1 Basic Hash Tables and Collision Resolution

    If we map the keys of a big universe U to a small set S = {0, ..., s − 1}, then it is unavoidable that many universe elements are mapped to the same element of S. In a dictionary structure, we do not have to store the entire universe, but only some set X ⊂ U of n keys for the objects currently in the dictionary. But if we
  • Fundamentals of Database Indexing and Searching
    Part II Low-Dimensional Index Structures

    Chapter 2 Hashing

    Hashing is a mathematical function that transforms a key k to be searched to a location h(k) where the contents corresponding to the key can be found. An example of a very simple hash function with m locations is h(k) = k mod m. A good hash function should exhibit the following two properties:

    • Uniform: The total domain of keys should be distributed uniformly over the range.
    • Random: The hash values should be distributed uniformly irrespective of the distribution of the keys.

    It is easy to see that more than one key can hash to the same location for a hash function. This phenomenon is called collision, and the ways to handle collisions are called collision resolution mechanisms.

    In a database context, the hash locations are disk pages or buckets that can contain multiple keys and the corresponding objects. Hence, the hash function maps a key to a bucket. The searching of a key within a bucket is a simple linear scan. Hence, the concept of collision is replaced by that of overflow, which happens when there is no more space in a hash bucket to store any more keys. Overflow can happen either due to a skew in the distribution of the keys or due to the non-uniformity of the hash function. The chances of overflowing can only be reduced, but can never be eliminated completely.

    Hash functions strive to complete searching for any key in O(1) time, i.e., within a constant number of steps. Thus, in a database querying context, hash functions are desirable for point queries. However, they cannot support range and kNN queries well.

    This chapter describes the various static and dynamic hashing techniques in the context of database queries.
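The bucket and overflow behaviour described in this excerpt can be illustrated with a small in-memory sketch. Everything beyond h(k) = k mod m is an assumption made for illustration: the class name `BucketFile`, the fixed bucket capacity, and the choice to place overflowing entries in a separate per-bucket overflow list are hypothetical and are not taken from the book.

```python
class BucketFile:
    """Fixed-capacity hash buckets with a simple overflow list (illustrative)."""

    def __init__(self, num_buckets: int, capacity: int):
        self.m = num_buckets
        self.capacity = capacity  # maximum entries per primary bucket
        self.buckets = [[] for _ in range(num_buckets)]
        self.overflow = [[] for _ in range(num_buckets)]  # assumed overflow handling

    def _bucket_of(self, key: int) -> int:
        # h(k) = k mod m maps a key to a bucket, not to a single slot.
        return key % self.m

    def insert(self, key: int, value) -> bool:
        """Store the entry; return False if it had to go to the overflow area."""
        b = self._bucket_of(key)
        if len(self.buckets[b]) < self.capacity:
            self.buckets[b].append((key, value))
            return True
        self.overflow[b].append((key, value))
        return False

    def lookup(self, key: int):
        """Point query: linear scan of one bucket and its overflow entries."""
        b = self._bucket_of(key)
        for k, v in self.buckets[b] + self.overflow[b]:
            if k == key:
                return v
        return None


# Skewed keys: 3, 7 and 11 all hash to bucket 3 when m = 4.
store = BucketFile(num_buckets=4, capacity=2)
results = [store.insert(k, f"record-{k}") for k in (3, 7, 11)]
```

A point query touches a single bucket, which is why it is fast. A range query gains nothing from this layout, because consecutive keys are scattered across different buckets.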
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.