Computer Science

Garbage Collection

Garbage collection is an automated process in computer programming that identifies and removes unused objects from memory. It helps to free up memory space and prevent memory leaks, which can cause programs to crash or slow down. Garbage collection is commonly used in high-level programming languages such as Java and Python.

Written by Perlego with AI-assistance

12 Key excerpts on "Garbage Collection"

  • Computer Memory Management
    World Technologies. Chapter 4: Garbage Collection (Computer Science)
    In computer science, Garbage Collection (GC) is a form of automatic memory management. It is a special case of resource management, in which the limited resource being managed is memory. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program. Garbage Collection was invented by John McCarthy around 1959 to solve problems in Lisp. Garbage Collection is often portrayed as the opposite of manual memory management, which requires the programmer to specify which objects to deallocate and return to the memory system. However, many systems use a combination of the two approaches, and other techniques such as stack allocation and region inference can carve off parts of the problem. There is an ambiguity of terms, as theory often uses the terms manual Garbage Collection and automatic Garbage Collection rather than manual memory management and Garbage Collection, and does not restrict Garbage Collection to memory management, rather considering that any logical or physical resource may be garbage collected. Garbage Collection does not traditionally manage limited resources other than memory that typical programs use, such as network sockets, database handles, user interaction windows, and file and device descriptors. Methods used to manage such resources, particularly destructors, may suffice as well to manage memory, leaving no need for GC. Some GC systems allow such other resources to be associated with a region of memory that, when collected, causes the other resource to be reclaimed; this is called finalization. Finalization may introduce complications limiting its usability, such as intolerable latency between disuse and reclaim of especially limited resources, or a lack of control over which thread performs the work of reclaiming.
  • Build Your Own Programming Language

    A developer's comprehensive guide to crafting, compiling, and implementing programming languages

    17

    Garbage Collection

    Memory management is one of the most important aspects of modern programming. Almost any language that you invent should provide automatic heap memory management via Garbage Collection. Garbage Collection refers to any mechanism by which heap memory is automatically freed and made available for reuse when it is no longer needed for a given purpose. The heap, as you may already know, is the region in memory from which objects are allocated by some explicit means such as the reserved word new (in Java). In lower-level languages, such objects live until the program explicitly frees them, but in many modern languages, heap objects are retained in memory as long as they are needed. After a heap object is of no further use in the program, its memory is made available to the program for other purposes by a Garbage Collection algorithm that runs behind the scenes in the programming language runtime system.
    This chapter presents a couple of methods with which you can implement Garbage Collection in your language. The first method, called reference counting, is not very difficult to implement and has the advantage of freeing memory incrementally as soon as the program is no longer using it. However, reference counting has a fatal flaw, which we will discuss in the section titled The drawbacks and limitations of reference counting. The second method, called mark-and-sweep collection, is a more robust mechanism, but it is much more challenging to implement. It has the downside that execution pauses periodically for however long the Garbage Collection process takes. These are just two of many possible approaches to memory management; for a more advanced and in-depth treatment of the subject, you may want to check out The Garbage Collection Handbook
  • Build Your Own Programming Language
    Chapter 16: Garbage Collection
    Memory management is one of the most important aspects of modern programming, and almost any language that you invent should provide automatic memory management via Garbage Collection. This chapter presents a couple of methods with which you can implement Garbage Collection in your language. The first method, called reference counting, is easy to implement and has the advantage of freeing memory as you go. However, reference counting has a fatal flaw. The second method, called mark-and-sweep collection, is a more robust mechanism that is much more challenging to implement, and it has the downside that execution pauses periodically for however long the Garbage Collection process takes. These are two of many possible approaches to memory management. Implementing a garbage collector with neither a fatal flaw nor periodic pauses to collect free memory is liable to have other costs associated with it.
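    The excerpt does not name the flaw, but the weakness usually attributed to reference counting is that objects referring to each other in a cycle never reach a count of zero. The following is a minimal sketch of that behavior, assuming CPython (which combines reference counting with a separate cycle collector exposed through the standard gc module). The Node class and the variable names are hypothetical, chosen only for illustration.

```python
import gc
import weakref

class Node:
    """Hypothetical object that can point at one other Node."""
    def __init__(self):
        self.other = None

gc.disable()  # switch off the automatic cycle collector so only reference counting acts

a, b = Node(), Node()
a.other, b.other = b, a        # a and b now refer to each other
probe = weakref.ref(a)         # observes the first node without keeping it alive

del a, b                       # the program can no longer reach either node
leaked = probe() is not None   # each count is still 1, so the cycle stays in memory

gc.collect()                   # an explicit tracing pass finds the unreachable cycle
reclaimed = probe() is None

gc.enable()
```

    Under these assumptions, leaked is true before the explicit collection and reclaimed is true after it, which is why reference-counting runtimes typically need a supplementary tracing collector or programmer-managed weak references.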
    This chapter covers the following main topics:
    • Appreciating the importance of Garbage Collection
    • Counting references to objects
    • Marking live data and sweeping the rest
    The goal of this chapter is to explain to you why Garbage Collection is important and show you how you can do it. The skills you'll learn include the following: making objects track how many references are pointing to them; identifying all the pointers to live data in programs and including pointers located within other objects; freeing memory and making it available for reuse. Let's start with a discussion of why you should bother with all this anyway.

    Appreciating the importance of Garbage Collection

    In the beginning, programs were small, and the static allocation of memory was decided when a program was designed. The code was not that complicated, and programmers could lay out all the memory that they were going to use during the entire program as a set of global variables. Life was good.
    Then, Moore's Law happened, and computers got bigger. Customers started demanding that programs handle arbitrary-sized data instead of accepting the fixed upper limits inherent in static allocation. Programmers invented structured programming and used function calls to organize larger programs in which most memory allocation was on the stack
  • Mastering Java 11

    Develop modular and secure Java applications using concurrency and advanced JDK libraries, 2nd Edition

    Our goal is to deallocate, or release, any previously allocated memory that we no longer need. Fortunately, with Java, there is a mechanism for handling this issue. It is called Garbage Collection.
    When an object, such as our myObjectName example, no longer has any references pointing to it, the system will reclaim the associated memory.

    Object destruction

    The idea of Java having a garbage collector running in the dark shadows of your code (usually a low-priority thread) and deallocating memory currently allocated to unreferenced objects, is appealing. So, how does this work? The Garbage Collection system determines which objects can still be reached through references held by the running program.
    When there are no references to an object, there is no way to get to it with the currently running code, so it makes perfect sense to deallocate the associated memory.
    The term memory leak refers to small memory chunks lost or improperly deallocated. These leaks are avoidable with Java's Garbage Collection.

    Garbage Collection algorithms

    There are several Garbage Collection algorithms, or types, for use by the JVM. In this section, we will cover the following Garbage Collection algorithms:
    • Mark and sweep
    • Concurrent Mark Sweep (CMS) Garbage Collection
    • Serial Garbage Collection
    • Parallel Garbage Collection
    • G1 Garbage Collection

    Mark and sweep

    Java's initial Garbage Collection algorithm, mark and sweep, used a simple two-step process:
    1. The first step, mark, is to step through all objects that have accessible references, marking those objects as alive
    2. The second step, sweep, involves scanning the heap for any object that is not marked and reclaiming its memory
    As you can readily determine, the mark and sweep algorithm seems effective but is probably not very efficient due to the two-step nature of this approach. This eventually led to a Java Garbage Collection system with vastly improved efficiencies.
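    The two steps above can be sketched on a toy object graph. This is an illustrative model, not the JVM's implementation: the Obj class, the mark and sweep functions, and the list standing in for the heap are all hypothetical names introduced here.

```python
class Obj:
    """Toy heap object with outgoing references and a mark bit."""
    def __init__(self, name):
        self.name = name
        self.refs = []        # references to other objects
        self.marked = False

def mark(roots):
    """Step 1: mark every object reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)

def sweep(heap):
    """Step 2: keep marked objects, drop the rest, reset marks for the next cycle."""
    survivors = [obj for obj in heap if obj.marked]
    for obj in survivors:
        obj.marked = False
    return survivors

a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)              # a -> b
c.refs.append(d)              # c -> d
d.refs.append(c)              # d -> c: a cycle that no root can reach
heap = [a, b, c, d]
roots = [a]                   # e.g. a local or static variable

mark(roots)
heap = sweep(heap)
live_names = [obj.name for obj in heap]   # ['a', 'b']
```

    Because reachability is computed from the roots, the unreachable cycle between c and d is discarded even though each object still holds a reference to the other.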

    Concurrent Mark Sweep (CMS) Garbage Collection

    The CMS algorithm for Garbage Collection scans heap memory using multiple threads. Similar to the mark and sweep method, it marks objects for removal and then makes a sweep to actually remove those objects. This method of Garbage Collection is essentially an upgraded mark and sweep method. It was modified to take advantage of faster systems and had performance enhancements.
  • Scripting with Objects

    A Comparative Presentation of Object-Oriented Scripting with Perl and Python

    • Avinash C. Kak (Author)
    • 2017 (Publication Date)
    • Wiley (Publisher)
    So even though your memory manager says that a large amount of memory is still available, it may nonetheless fail to carry out the next allocation. Another problem related to memory management is known as the memory locality problem or the problem of poor locality of reference. If two objects that need to refer to each other are far separated in the address space, an operating system using virtual memory may be able to keep only one of them in its active memory at any time. This could extract a performance penalty at run time if both objects are being used simultaneously and one or the other object needs to be paged in constantly.
    An important component of any modern memory management system is automatic Garbage Collection. A Garbage Collection system is supposed to figure out what objects are no longer needed by a program and to then free up the memory occupied by such objects. Several basic strategies for Garbage Collection include:
    Reference Counting: This is probably the simplest approach and also the one most frequently used with scripting languages like Perl and Python. In reference-counting-based Garbage Collection, every time a new variable acquires a reference to an object, the reference count associated with the object goes up by one. And every time a variable loses a previously held reference to an object (say, because the variable went out of scope), the reference count associated with the object decreases by one. When the reference count associated with an object goes to zero, the object is deallocated. Reference-counting-based Garbage Collection becomes transparent to the user if the compiler supports it. If the compiler generates the code needed for incrementing and decrementing the reference counts, the Garbage Collection process becomes smoothly integrated with the run-time execution of the script. What that implies is that, at run time, the execution of the script does not have to be paused to clean up the memory
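    The counting rule described above can be illustrated with a small hand-written model. This is only a sketch: the Counted class and the incref and decref helpers are hypothetical names, and a real interpreter would emit the equivalent bookkeeping automatically rather than require explicit calls.

```python
class Counted:
    """Toy heap object that tracks how many references point to it."""
    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.freed = False

def incref(obj):
    """A variable has acquired a reference to obj."""
    obj.refcount += 1

def decref(obj):
    """A variable has lost its reference to obj; free it when the count hits zero."""
    obj.refcount -= 1
    if obj.refcount == 0:
        obj.freed = True      # stand-in for returning the memory to the allocator

obj = Counted("buffer")
incref(obj)                   # variable x now refers to obj
incref(obj)                   # variable y now refers to obj
decref(obj)                   # x goes out of scope
still_alive = not obj.freed   # one reference remains
decref(obj)                   # y goes out of scope; the count reaches zero
```

    The object is released at the exact moment its last reference disappears, which is why this approach needs no separate collection pause.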
  • Introduction to Programming Languages
    Garbage Collection recycles memory from the released state to the free state. Dynamic data cells can be released automatically or programmatically. During the process of continuous allocation and deallocation, the free space keeps getting interspersed into isolated smaller memory chunks. Many of the memory blocks are so small that they individually cannot be used for effective memory allocation. This kind of formation of interspersed small blocks is called fragmentation, and it negatively affects the execution-time performance of garbage collectors and program execution by decreasing the hit ratio in cache and increasing the frequency of page faults and Garbage Collections. Garbage Collection can be done continuously, as in reference-count Garbage Collection, or periodically after no more memory can be allocated in the heap space. Periodic garbage collectors can be start-and-stop or incremental. Start-and-stop garbage collectors suspend the program execution completely during Garbage Collection and are unsuitable for handling real-time events, since Garbage Collection causes significant delay due to memory and execution-time overhead. To avoid this problem, many approaches have been tried, such as incremental Garbage Collection, concurrent Garbage Collection, and continuous Garbage Collection. In incremental Garbage Collection, one big Garbage Collection period is divided into multiple smaller Garbage Collection periods interleaved with small periods of program execution, such that the collection rate is faster than the memory allocation rate. Concurrent Garbage Collection runs program execution and Garbage Collection simultaneously. However, concurrent collectors have to handle the atomicity of the multiple instructions involved in a common atomic operation, and provide synchronization while sharing memory locations between the garbage collector and program execution.
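    A small numeric illustration of fragmentation may help. The sketch below assumes a first-fit allocator over a list of free blocks; the first_fit function and the block layout are hypothetical and chosen only to show how plenty of free memory can coexist with a failed allocation.

```python
def first_fit(free_blocks, size):
    """Return the start address of the first free block that can hold size, else None."""
    for start, length in free_blocks:
        if length >= size:
            return start
    return None

# (start address, length) pairs of free memory, separated by live objects
free_blocks = [(0, 10), (30, 10), (60, 10)]

total_free = sum(length for _, length in free_blocks)   # 30 units free overall
small = first_fit(free_blocks, 8)                        # fits in the first hole
large = first_fit(free_blocks, 20)                       # no single hole is big enough
```

    Here 30 units are free in total, yet a request for 20 contiguous units fails because the largest hole holds only 10.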
  • The Compiler Design Handbook

    Optimizations and Machine Code Generation, Second Edition

    • Y.N. Srikant, Priti Shankar (Authors)
    • 2018 (Publication Date)
    • CRC Press (Publisher)
    It also establishes several metrics on the basis of which garbage collectors are evaluated and identifies the distinguishing features of garbage collectors.
    6.1.1 The Need for Garbage Collection
    A program in execution needs memory to store the data manipulated by it. The data is named by variables in the program. Memory is allocated in various ways that differ from each other in their answers to the following questions: At what point of time is a variable bound to a chunk of memory, and how long does the binding last? In the earliest form of memory allocation, called static allocation, the binding is established at compile time and does not change throughout the program execution. In the case of stack allocation, the binding is created during the invocation of the function that has the variable in its scope and lasts for the lifetime of the function. In heap allocation, the binding is created explicitly by executing a statement that allocates a chunk of memory and explicitly binds an access expression to it. An access expression is a generalization of a variable and denotes an address. In its general form, it is a reference or a pointer variable followed by a sequence of field names. One of the ways in which the binding can be undone is by disposing of the activation record that contains the pointer beginning the access expression. Then the access expression ceases to have any meaning. The other way is to execute an assignment statement that will bind the access expression to a different memory chunk. After a binding is changed, the chunk of memory may be inaccessible. An important issue here is the reclamation of such unreachable memory. The reclaimed memory can be subsequently allocated to a different access expression.
  • Unity 2017 Game Optimization - Second Edition
    Managed Heap). This heap space starts off fairly small, less than 1 Megabyte, but will grow as new blocks of memory are needed by our script code. This space can also shrink by releasing it back to the OS if Unity determines that it's no longer needed.

    Garbage Collection

    The Garbage Collector (hereafter referred to as the GC) has an important job, which is to ensure that we don't use more Managed Heap memory than we need, and that memory that is no longer needed will be automatically deallocated. For instance, if we create a GameObject, and then later destroy it, the GC will flag the memory space used by the GameObject for eventual deallocation later. This is not an immediate process, as the GC only deallocates memory when necessary.
    When a new memory request is made, and there is enough empty space in the Managed Heap to satisfy the request, the GC simply allocates the new space and hands it over to the caller. However, if the Managed Heap does not have room for it, then the GC will need to scan all the existing memory allocations for anything that is no longer being used and clean it up first. It will only expand the current heap space as a last resort.
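    The allocation policy in the paragraph above (allocate if there is room, otherwise collect, and expand only as a last resort) can be sketched as follows. This is a simplified model under stated assumptions, not Unity's or Mono's code: the ManagedHeap class, its methods, and the flat name-to-size bookkeeping are all hypothetical.

```python
class ManagedHeap:
    """Toy managed heap: a capacity, sized objects, and a set of live names."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.objects = {}      # name -> size of every object still occupying memory
        self.roots = set()     # names the program still references
        self.log = []          # records when the heap collected or expanded

    def used(self):
        return sum(self.objects.values())

    def collect(self):
        """Drop every object the program no longer references."""
        self.objects = {n: s for n, s in self.objects.items() if n in self.roots}
        self.log.append("collect")

    def allocate(self, name, size):
        if self.used() + size > self.capacity:
            self.collect()                          # first try to reclaim garbage
        if self.used() + size > self.capacity:
            # Last resort. Growing by exactly the shortfall is a simplification;
            # real runtimes typically grow in larger steps.
            self.capacity = self.used() + size
            self.log.append("expand")
        self.objects[name] = size
        self.roots.add(name)

    def release(self, name):
        """The program drops its reference; the memory is not freed yet."""
        self.roots.discard(name)

heap = ManagedHeap(100)
heap.allocate("a", 60)
heap.release("a")          # "a" is now garbage but still occupies 60 units
heap.allocate("b", 30)     # fits: no collection needed
heap.allocate("c", 40)     # does not fit: collecting "a" makes room
heap.allocate("d", 50)     # does not fit even after collecting: heap expands
```

    In this run the log ends up as collect, collect, expand: the second collection frees nothing because b and c are still referenced, so the heap grows from 100 to 120 units.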
    The GC in the version of Mono that Unity uses is a type of Tracing Garbage Collector, which uses a Mark-and-Sweep strategy. This algorithm works in two phases: each allocated object is tracked with an additional bit. This flags whether the object has been marked or not. These flags start off set to false to indicate that it has not yet been marked.
    When the collection process begins, it marks all objects that are still reachable to the program by setting their flags to true. Either the reachable object is a direct reference, such as static or local variables on the stack, or it is an indirect reference through the fields (member variables) of other directly or indirectly accessible objects. In essence, it is gathering a set of objects that are still referenceable to our application. Everything that is not still referenceable would be effectively invisible to our application and can be deallocated by the GC.
  • Professional C# 6 and .NET Core 1.0
    • Christian Nagel (Author)
    • 2016 (Publication Date)
    • Wrox (Publisher)
    That is the power of reference data types, and you will see this feature used extensively in C# code. It means that you have a high degree of control over the lifetime of your data, because it is guaranteed to exist in the heap as long as you are maintaining some reference to it.

    Garbage Collection

    The previous discussion and diagrams show the managed heap working very much like the stack, to the extent that successive objects are placed next to each other in memory. This means that you can determine where to place the next object by using a heap pointer that indicates the next free memory location, which is adjusted as you add more objects to the heap. However, things are complicated by the fact that the lives of the heap-based objects are not coupled with the scope of the individual stack-based variables that reference them.
    When the garbage collector runs, it removes all those objects from the heap that are no longer referenced. The GC finds all referenced objects from a root table of references and continues to the tree of referenced objects. Immediately after, the heap has objects scattered on it, which are mixed up with memory that has just been freed (see Figure 5.5).
    Figure 5.5
    If the managed heap stayed like this, allocating space for new objects would be an awkward process, with the runtime having to search through the heap for a block of memory big enough to store each new object. However, the garbage collector does not leave the heap in this state. As soon as the garbage collector has freed all the objects it can, it compacts the heap by moving all the remaining objects to form one continuous block of memory. This means that the heap can continue working just like the stack, as far as locating where to store new objects. Of course, when the objects are moved about, all the references to those objects need to be updated with the correct new addresses, but the garbage collector handles that, too.
    This action of compacting by the garbage collector is where the managed heap works very differently from unmanaged heaps. With the managed heap, it is just a question of reading the value of the heap pointer, rather than iterating through a linked list of addresses to find somewhere to put the new data.
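    The compaction step described above can be sketched on a toy heap. This is an illustrative model rather than the .NET runtime's algorithm: the compact function, the list of slots standing in for memory, and the dictionary standing in for stack references are hypothetical.

```python
def compact(heap, references):
    """Slide live objects to the start of the heap and fix up references.

    heap: list of slots, where None marks memory that was just freed.
    references: dict mapping a variable name to the slot index it points at.
    Returns (new_heap, new_references, heap_pointer).
    """
    forwarding = {}                       # old slot index -> new slot index
    new_heap = []
    for old_index, obj in enumerate(heap):
        if obj is not None:
            forwarding[old_index] = len(new_heap)
            new_heap.append(obj)
    # Every reference must be updated to the object's new address.
    new_refs = {name: forwarding[idx] for name, idx in references.items()}
    heap_pointer = len(new_heap)          # next free slot, as with a stack
    return new_heap, new_refs, heap_pointer

heap = ["A", None, "B", None, None, "C"]
references = {"x": 0, "y": 2, "z": 5}
heap, references, heap_pointer = compact(heap, references)
```

    After compaction the three live objects occupy slots 0 to 2, the references point at their new slots, and the next allocation only needs to read heap_pointer instead of searching for a hole.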
  • Java 11 Cookbook

    A definitive guide to learning the key concepts of modern application development, 2nd Edition

    • Nick Samoylov, Mohamed Sanaulla (Authors)
    • 2018 (Publication Date)
    • Packt Publishing (Publisher)

    Memory Management and Debugging

    In this chapter, we will cover the following recipes:
    • Understanding the G1 garbage collector
    • Unified logging for JVM
    • Using the jcmd command for JVM
    • Try-with-resources for better resource handling
    • Stack walking for improved debugging
    • Using the memory-aware coding style
    • Best practices for better memory usage
    • Understand Epsilon, a low-overhead garbage collector

    Introduction

    Memory management is the process of memory allocation for program execution and memory reuse after some of the allocated memory is not used anymore. In Java, this process is called Garbage Collection (GC). The effectiveness of GC affects two major application characteristics: responsiveness and throughput.
    Responsiveness is measured by how quickly an application responds to the request. For example, how quickly a website returns a page or how quickly a desktop application responds to an event. Naturally, the lower the response time, the better the user experience, which is the goal for many applications.
    Throughput indicates the amount of work an application can do in a unit of time. For example, how many requests a web application can serve or how many transactions a database can support. The bigger the number, the more value the application can potentially generate and the greater number of users it can accommodate.
    Not every application needs to have the minimal possible responsiveness and the maximum achievable throughput. An application may be an asynchronous submit-and-go-do-something-else, which does not require much user interaction. There may be a few potential application users too, so a lower-than-average throughput could be more than enough. Yet, there are applications that have high requirements to one or both of these characteristics and cannot tolerate long pauses imposed by the GC process.
    GC, on the other hand, needs to stop any application execution once in a while to reassess the memory usage and to release it from data no longer used. Such periods of GC activity are called stop-the-world. The longer they are, the quicker the GC does its job and the longer an application freeze lasts, which can eventually grow big enough to affect both the application responsiveness and throughput. If that is the case, the GC tuning and JVM optimization become important and require an understanding of the GC principles and their modern implementations.
  • Unity Game Optimization

    Enhance and extend the performance of all aspects of your Unity games, 3rd Edition

    • Dr. Davide Aversa, Chris Dickinson (Authors)
    • 2019 (Publication Date)
    • Packt Publishing (Publisher)
    The GC has an important job, which is to ensure that we don't use more managed heap memory than we need, and that memory that is no longer needed will be automatically deallocated. For instance, if we create a GameObject and then later destroy it, the GC will flag the memory space used by the GameObject for eventual deallocation later. This is not an immediate process, as the GC only deallocates memory when necessary.
    When a new memory request is made, and there is enough empty space in the managed heap to satisfy the request, the GC simply allocates the new space and hands it over to the caller. However, if the managed heap does not have room for it, then the GC will need to scan all of the existing memory allocations for anything that is no longer being used and clean it up first. It will only expand the current heap space as the last resort.
    The GC in the version of Mono that Unity uses is a type of tracing GC, which uses a Mark-and-Sweep strategy. This algorithm works in two phases: each allocated object is tracked with an additional bit. This flags whether the object has been marked or not. These flags start set to false to indicate that it has not yet been marked.
    When the collection process begins, it marks all objects that are still reachable to the program by setting their flags to true. Either the reachable object is a direct reference, such as static or local variables on the stack, or it is an indirect reference through the fields (member variables) of other directly or indirectly accessible objects. In essence, it is gathering a set of objects that are still referenceable to our application. Everything that is not still referenceable would be effectively invisible to our application and can be deallocated by the GC.
    The second phase involves iterating through this catalog of references (which the GC will have kept track of throughout the lifetime of the application) and determining whether or not it should be deallocated based on its marked
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.