Uploaded image for project: 'Geode'
  1. Geode
  2. GEODE-10323

OffHeapStorageJUnitTest testCreateOffHeapStorage fails with AssertionError: expected:<100> but was:<1048576>

    XMLWordPrintableJSON

Details

    Description

      [Please see 1st comment on this ticket below the description for recommendation on how to fix.]

      OffHeapStorageJUnitTest testCreateOffHeapStorage started failing intermittently due to commit a350ed2 which went in as "GEODE-10087: Enhance off-heap fragmentation visibility. (#7407)".

      The failure stack looks like:

      OffHeapStorageJUnitTest > testCreateOffHeapStorage FAILED
          java.lang.AssertionError: expected:<100> but was:<1048576>
              at org.junit.Assert.fail(Assert.java:89)
              at org.junit.Assert.failNotEquals(Assert.java:835)
              at org.junit.Assert.assertEquals(Assert.java:647)
              at org.junit.Assert.assertEquals(Assert.java:633)
              at org.apache.geode.internal.offheap.OffHeapStorageJUnitTest.testCreateOffHeapStorage(OffHeapStorageJUnitTest.java:220)
      

      The cause is in MemoryAllocatorImpl. A new scheduled executor updateNonRealTimeStatsExecutor was added near the bottom of the constructor which immediately gets scheduled to invoke freeList::updateNonRealTimeStats repeatedly at the specified frequency of updateOffHeapStatsFrequencyMs whenever an instance of MemoryAllocatorImpl is created.

      freeList::updateNonRealTimeStats then updates setLargestFragment and setFreedChunks.

      Scheduling or starting any sort of threads from a constructor is considered a bad practice and should only be done via some sort of activation method such as start() which can be invoked from the static factory method for MemoryAllocatorImpl. The unit tests would then continue to construct instances via a constructor that does NOT invoke start().

      OffHeapStorageJUnitTest testCreateOffHeapStorage, and possibly other tests, is written with the assumption that no other thread or component can or will be updating the OffHeapStats. Hence it is then safe for the test to setLargestFragment and then assert that it has the value passed in:

      219:      stats.setLargestFragment(100);
      220:      assertEquals(100, stats.getLargestFragment());
      

      Line 220 is the source of the assertion failure because updateNonRealTimeStatsExecutor is racing with the JUnit thread and changes the value of setLargestFragment immediately after the test sets the value to 100, but before the test invokes the assertion.

      Looking back at the PR test results we can see that stress-new-test failed 5 times with this exact failure before being merged to develop:

      Attachments

        Activity

          People

            alberto.gomez Alberto Gomez
            klund Kirk Lund
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: