How to calculate one region's actual data size by function execution service in VMware GemFire

Article ID: 294322

Products

VMware Tanzu GemFire

Issue/Introduction

You want to calculate the size of a specified region in bytes (or MB/GB) in VMware GemFire.

Environment

Product Version: 9.0

Resolution

This article covers a sample implementation that calculates the size of a specified region by using the function execution service and Spring Boot.


Function Execution Service Usage

1. Function Execution Service Name: region-size-calculator

2. Function Execution Service Input Parameters:
  • Required argument - the name of the region.
  • Optional argument - the number of entries to sample. For a very large region (for example, 1 billion entries), measuring every entry may be unnecessary. This argument limits the number of entries that are sampled; the total size is then projected from the sampled results multiplied up to the number of entries in the region.
  • In gfsh, function execution arguments are passed as a comma-delimited string.
3. Function Execution Service Result Output Items (all sizes are in bytes):
  • Deserialized values size
  • Serialized values size
  • Keys size
  • Region type
  • Entries
For example:
gfsh>execute function --id=region-size-calculator --arguments="exampleRegion,10" --member=server1
Execution summary

         Member ID/Name          | Function Execution Result
-------------------------------- | ----------------------------------------------------------------------------------------------------------------
172.17.0.2(server1:162)<v1>:1025 | [{Deserialized values size=720, Serialized values size=170, Keys size=480, Region type=Partitioned, Entries=10}]
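
The projection described for the optional sample argument can be sketched as simple arithmetic. The following is a minimal illustration only, not the utility's actual code: the class name SampledSizeProjection and the method projectTotalBytes are hypothetical, and it assumes the projection is the average sampled entry size multiplied by the region's entry count.

```java
/** Hypothetical helper illustrating the sample-based projection; not part of the utility. */
final class SampledSizeProjection {
    private SampledSizeProjection() {}

    /**
     * Projects a region's total size in bytes from a measured sample.
     *
     * @param sampledBytes   total bytes measured across the sampled entries
     * @param sampledEntries number of entries that were actually measured
     * @param regionEntries  number of entries in the whole region
     */
    static long projectTotalBytes(long sampledBytes, long sampledEntries, long regionEntries) {
        if (sampledEntries <= 0 || regionEntries <= 0) {
            return 0L; // nothing measured or empty region
        }
        if (sampledEntries >= regionEntries) {
            return sampledBytes; // every entry was measured, so no projection is needed
        }
        // Assumption: average sampled entry size scaled to the full entry count.
        double averageEntryBytes = (double) sampledBytes / sampledEntries;
        return Math.round(averageEntryBytes * regionEntries);
    }

    public static void main(String[] args) {
        // 10 sampled entries totalling 170 serialized bytes, projected onto 1,000,000 entries.
        System.out.println(projectTotalBytes(170, 10, 1_000_000));
    }
}
```

A projection like this is only as good as the sample: if entry sizes vary widely across the region, a small sample may give a misleading total.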


How to install this utility

1. Download this implementation from the GitHub repository to your local environment.

For example:
git clone https://github.com/GSSJacky/RegionSize-Calculator-Gemfire9

2. Configure .m2/settings.xml with the credentials of your registered "Pivotal Commercial Maven Repository" account, as described in the VMware GemFire documentation.
<settings>
       <servers>
           <server>
               <id>gemfire-release-repo</id>
               <username>[email protected]</username>
               <password>MY-DECRYPTED-PASSWORD</password>
           </server>
       </servers>
</settings>

3. Build the project with the mvn package command or with an IDE such as Visual Studio Code. The build produces functions-0.0.1-SNAPSHOT.jar in the target folder.

For example:
XXXXX_MBP:functions XXXXX$ mvn clean package
[INFO] Scanning for projects...
[INFO] 
[INFO] ------------------------< io.pivotal:functions >------------------------
[INFO] Building functions 0.0.1-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
......
[INFO] --- maven-compiler-plugin:3.8.1:compile (default-compile) @ functions ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 2 source files to /Users/jaxu/Downloads/RegionSize-Calculator-Gemfire9/target/classes
[INFO] /Users/xxxxx/Downloads/RegionSize-Calculator-Gemfire9/src/main/java/io/pivotal/utils/SizeCalculator.java: Some input files use or override a deprecated API.
[INFO] /Users/xxxxx/Downloads/RegionSize-Calculator-Gemfire9/src/main/java/io/pivotal/utils/SizeCalculator.java: Recompile with -Xlint:deprecation for details.
[INFO] /Users/xxxxx/Downloads/RegionSize-Calculator-Gemfire9/src/main/java/io/pivotal/functions/SizeCalculatorFunction.java: /Users/jaxu/Downloads/RegionSize-Calculator-Gemfire9/src/main/java/io/pivotal/functions/SizeCalculatorFunction.java uses unchecked or unsafe operations.
[INFO] /Users/xxxxx/Downloads/RegionSize-Calculator-Gemfire9/src/main/java/io/pivotal/functions/SizeCalculatorFunction.java: Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:3.1.0:testResources (default-testResources) @ functions ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /Users/jaxu/Downloads/RegionSize-Calculator-Gemfire9/src/test/resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.8.1:testCompile (default-testCompile) @ functions ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.22.2:test (default-test) @ functions ---
[INFO] No tests to run.
[INFO] 
[INFO] --- maven-jar-plugin:3.1.2:jar (default-jar) @ functions ---
[INFO] Building jar: /Users/xxxxx/Downloads/RegionSize-Calculator-Gemfire9/target/functions-0.0.1-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  3.187 s
[INFO] Finished at: 2020-01-27T18:03:04+09:00
[INFO] ------------------------------------------------------------------------

4. You can deploy the functions-0.0.1-SNAPSHOT.jar into a GemFire cluster by executing the following gfsh command:
gfsh>deploy --jar=/Users/xxxxx/Downloads/RegionSize-Calculator-Gemfire9/target/functions-0.0.1-SNAPSHOT.jar

Deploying files: functions-0.0.1-SNAPSHOT.jar
Total file size is: 0.01MB

Continue?  (Y/n): Y
      Member        |         Deployed JAR         | Deployed JAR Location
------------------- | ---------------------------- | ------------------------------------------------------------------------------------
server1             | functions-0.0.1-SNAPSHOT.jar | /Users/xxxxx/Downloads/vsc_work/clusters/cacheserver1/functions-0.0.1-SNAPSHOT.v2.jar

gfsh>list deployed
      Member        |             JAR              | JAR Location
------------------- | ---------------------------- | ------------------------------------------------------------------------------------
server1             | functions-0.0.1-SNAPSHOT.jar | /Users/xxxxx/Downloads/work/clusters/cacheserver1/functions-0.0.1-SNAPSHOT.v2.jar


Considerations

Calculating object sizes in a high-speed data grid is too expensive to perform on every entry. The Region Size Calculator computes sizes with the following considerations in mind.
  • VMware GemFire stores data in serialized form. In some circumstances it temporarily holds an object in deserialized form, and that deserialized object is later garbage collected. The actual region size therefore fluctuates depending on your operations. If you store your objects using PDX and run queries with “Select *”, GemFire keeps the object in deserialized form until the next GC. If your queries use field names, such as “Select lastName, firstName”, GemFire keeps the object in serialized form. Function execution also affects PDX deserialization: if your function casts a PDX object to its domain object, the object is held in deserialized form temporarily, on that node only.
  • The Region Size Calculator returns both the deserialized and the serialized storage size, so you can estimate the real size of the region based on how you use it. If you do not use “Select *” and do not cast PDX objects to the domain object in functions, the values stay serialized and your region size is the sum of the keys and the serialized values.
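
As a rough illustration of how the result fields can be combined, the sketch below parses a result string in the format shown in the gfsh example and adds the key size to either the serialized or the deserialized value size. The class RegionSizeEstimate and its methods are hypothetical, not part of the utility. It assumes that values stay serialized unless queries or functions deserialize them, and it treats the two cases as lower and upper bounds.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for combining the calculator's output fields; not part of the utility. */
final class RegionSizeEstimate {
    private RegionSizeEstimate() {}

    /** Parses a result such as "[{Keys size=480, Entries=10}]" into field/value pairs. */
    static Map<String, String> parseResult(String result) {
        // Strip the leading "[{" and trailing "}]" that wrap the function result.
        String body = result.trim()
                .replaceAll("^\\[?\\{", "")
                .replaceAll("\\}\\]?$", "");
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : body.split(",\\s*")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                fields.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return fields;
    }

    /**
     * Estimates the region size in bytes as keys plus values.
     *
     * @param valuesDeserialized true if values are expected to be held deserialized
     *                           (for example after "Select *" queries), false otherwise
     */
    static long estimateBytes(Map<String, String> fields, boolean valuesDeserialized) {
        long keys = Long.parseLong(fields.get("Keys size"));
        String valueField = valuesDeserialized
                ? "Deserialized values size"
                : "Serialized values size";
        return keys + Long.parseLong(fields.get(valueField));
    }

    public static void main(String[] args) {
        String result = "[{Deserialized values size=720, Serialized values size=170, "
                + "Keys size=480, Region type=Partitioned, Entries=10}]";
        Map<String, String> fields = parseResult(result);
        System.out.println("Serialized estimate (bytes):   " + estimateBytes(fields, false));
        System.out.println("Deserialized estimate (bytes): " + estimateBytes(fields, true));
    }
}
```

With the sample output from the usage example, the serialized estimate is 480 + 170 = 650 bytes and the deserialized estimate is 480 + 720 = 1200 bytes. The real footprint at any moment lies between the two, depending on how many values are currently held in deserialized form.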