{"id":359,"date":"2021-04-16T23:08:02","date_gmt":"2021-04-16T23:08:02","guid":{"rendered":"http:\/\/rainforestqa.com\/hunting-race-condition-in-android-10-emulator\/"},"modified":"2023-02-14T17:56:31","modified_gmt":"2023-02-14T17:56:31","slug":"hunting-race-condition-in-android-10-emulator","status":"publish","type":"post","link":"https:\/\/www.rainforestqa.com\/blog\/hunting-race-condition-in-android-10-emulator","title":{"rendered":"Hunting a race condition in the Android 10 Emulator"},"content":{"rendered":"\n<h4 class=\"wp-block-heading\">How we found a race condition in the AOSP with the Android emulator that affected the amount of heap space available to apps.<\/h4>\n\n\n\n<p>Rainforest supports testing native mobile applications on Android using the official Android emulator from Google. Using emulators instead of real physical devices provides a bunch of benefits, such as being able to reproduce issues locally (hard to do if you don\u2019t have the same hardware device, but trivial if you can run the same emulator), better isolation (no need to wipe a real device or worry about data leaking on it; we can just throw the entire emulator away and make a new one), faster turnaround (we are not limited by the number of physical devices), and some nice debugging features that the emulator supports (location spoofing, virtual camera support, etc.).<\/p>\n\n\n\n<p>One of our customers was experiencing periodic crashes of their application when testing in our Android 10 emulator. After some initial investigation it appeared that their application was exhausting the available heap space, leading to a crash. We found that in the instances where their application was crashing, the log messages would show that the heap was only 16 MiB. This was quite surprising to us since our emulators have ~4 GiB of memory. 
It was also very odd because further investigation (i.e.\u00a0<code>adb shell getprop dalvik.vm.heapsize<\/code>) showed that the heap size should have been\u00a0<code>576m<\/code>. So why was the application crashing after only using 16 MiB?<\/p>\n\n\n\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.rainforestqa.com\/blog\/hunting-race-condition-in-android-10-emulator\/#How_Androids_jvm_is_configured\" >How Android\u2019s jvm is configured<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a 
class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.rainforestqa.com\/blog\/hunting-race-condition-in-android-10-emulator\/#Finding_the_race\" >Finding the race<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.rainforestqa.com\/blog\/hunting-race-condition-in-android-10-emulator\/#Making_sure_we_always_win_the_race\" >Making sure we always win the race<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_Androids_jvm_is_configured\"><\/span>How Android\u2019s jvm is configured<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>To get to the bottom of this we needed to understand how the Android Java virtual machine is configured. The short version is that on boot a process called\u00a0<code>zygote<\/code>\u00a0is started. This process launches a jvm and preloads some common Android classes into it. When any application is started, the\u00a0<code>zygote<\/code>\u00a0process is forked and the application starts running on the pre-initialized jvm instance. This saves the cost of initializing a new jvm instance every time an app starts.<\/p>\n\n\n\n<p>There is <a href=\"https:\/\/medium.com\/android-news\/android-application-launch-explained-from-zygote-to-your-activity-oncreate-8a8f036864b\" target=\"_blank\" rel=\"noopener\">a great post about how all of this works<\/a> if you\u2019re looking for more detail.<\/p>\n\n\n\n<p>The important thing here is that\u00a0<code>zygote<\/code>\u00a0is what controls the jvm runtime configuration, including the jvm heap size. 
Since Android is open source we can go look at the source to see how the jvm is configured.<\/p>\n\n\n\n<p>In <a href=\"https:\/\/cs.android.com\/android\/platform\/superproject\/+\/android-10.0.0_r30:frameworks\/base\/core\/jni\/AndroidRuntime.cpp;l=770\" target=\"_blank\" rel=\"noopener\">frameworks\/base\/core\/jni\/AndroidRuntime.cpp:770<\/a> we find:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/*\n  * The default starting and maximum size of the heap.  Larger\n  * values should be specified in a product property override.\n  *\/\nparseRuntimeOption(\"dalvik.vm.heapsize\", heapsizeOptsBuf, \"-Xmx\", \"16m\");<\/code><\/pre>\n\n\n\n<p>This argument should look familiar if you\u2019ve worked with Java in the past:\u00a0<code>-Xmx<\/code>\u00a0is the standard way to set the maximum heap size in a Java application. This line of code sets the maximum heap size to the value of the\u00a0<code>dalvik.vm.heapsize<\/code>\u00a0property, with a default of\u00a0<code>16m<\/code>\u00a0if that property doesn\u2019t exist.<\/p>\n\n\n\n<p>It also turns out that all the jvm arguments are logged by the zygote process on startup, so we can check what value is being set on boot. In our case we were seeing&nbsp;<code>-Xmx16m<\/code>&nbsp;as one of the arguments:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>zygote  : option&#91;0]=-Xzygote\nzygote  : option&#91;1]=-Xcheck:jni\nzygote  : option&#91;2]=exit\nzygote  : option&#91;3]=vfprintf\nzygote  : option&#91;4]=sensitiveThread\nzygote  : option&#91;5]=-verbose:gc\nzygote  : option&#91;6]=-Xms4m\nzygote  : option&#91;7]=-Xmx16m\nzygote  : option&#91;8]=-Xusejit:true<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Finding_the_race\"><\/span>Finding the race<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>After the Android emulator finishes booting we know\u00a0<code>dalvik.vm.heapsize<\/code>\u00a0has the correct value. 
But we know from the logs that when\u00a0<code>zygote<\/code>\u00a0initializes, the heap size is being set to 16m. This means the property is either set to 16m at that time, or it\u2019s unset and\u00a0<code>zygote<\/code>\u00a0is falling back to the default.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">System properties<\/h3>\n\n\n\n<p>System properties are loaded by the\u00a0<code>init<\/code>\u00a0binary before it starts executing the init scripts. They are read from a few different files (see\u00a0<a href=\"https:\/\/cs.android.com\/android\/platform\/superproject\/+\/android-10.0.0_r30:system\/core\/init\/property_service.cpp;l=876\" target=\"_blank\" rel=\"noopener\">system\/core\/init\/property_service.cpp:876<\/a>\u00a0for the specific details), but in the case of the Android 10 emulator the only relevant one is\u00a0<code>\/system\/build.prop<\/code>. Checking this file reveals that no value is specified for the\u00a0<code>dalvik.vm.heapsize<\/code>\u00a0property. Searching the filesystem for other property files that set this property comes up empty as well.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Android Init<\/h3>\n\n\n\n<p>Inside\u00a0<a href=\"https:\/\/cs.android.com\/android\/platform\/superproject\/+\/android-10.0.0_r30:system\/core\/init\/init.cpp;l=648\" target=\"_blank\" rel=\"noopener\">system\/core\/init\/init.cpp:648<\/a>, a function called\u00a0<code>process_kernel_cmdline()<\/code>\u00a0runs, which parses the kernel command line and creates Android properties out of what it finds there. These properties are created with the\u00a0<code>ro.kernel.<\/code>\u00a0prefix. This way of setting properties is really only useful to the Android emulator, which uses it to give users a few settings knobs that control these values inside the emulator. 
Since this is only setting\u00a0<code>ro.kernel.qemu.dalvik.vm.heapsize<\/code>, we\u2019re still left wondering how\u00a0<code>dalvik.vm.heapsize<\/code>\u00a0eventually gets set to the proper value.<\/p>\n\n\n\n<p>Digging into the&nbsp;<code>init<\/code>&nbsp;scripts, we discover in&nbsp;<a href=\"https:\/\/cs.android.com\/android\/platform\/superproject\/+\/android-10.0.0_r30:device\/generic\/goldfish\/init.ranchu.rc;l=36-37\" target=\"_blank\" rel=\"noopener\">vendor\/etc\/init\/hw\/init.ranchu.rc:36-37<\/a>&nbsp;a call to&nbsp;<code>setprop<\/code>&nbsp;which copies the value from the&nbsp;<code>ro.kernel<\/code>&nbsp;property to the real one. In the main&nbsp;<a href=\"https:\/\/cs.android.com\/android\/platform\/superproject\/+\/android-10.0.0_r30:system\/core\/rootdir\/init.rc;l=339-346\" target=\"_blank\" rel=\"noopener\">\/init.rc<\/a>&nbsp;we find out that&nbsp;<code>zygote<\/code>&nbsp;is asked to start before this. Simplifying that down, the relevant parts of the init scripts look something like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># in \/init.rc\non late-init\n    trigger zygote-start\n    trigger boot\n\non zygote-start\n    start zygote\n\n# in vendor\/etc\/init\/hw\/init.ranchu.rc\non boot\n    setprop dalvik.vm.heapsize ${ro.kernel.qemu.dalvik.vm.heapsize}<\/code><\/pre>\n\n\n\n<p>It\u2019s important to understand how these init scripts are parsed; thankfully this is\u00a0<a href=\"https:\/\/android.googlesource.com\/platform\/system\/core\/+\/master\/init\/README.md\" target=\"_blank\" rel=\"noopener\">described in detail in the AOSP source code<\/a>. To save you some reading, the important part is that\u00a0<code>start zygote<\/code>\u00a0is not synchronous (and even if it were,\u00a0<code>zygote<\/code>\u00a0does not immediately read the system property). 
The race should become clear now.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The race in action:<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Android boots, and eventually the main&nbsp;<code>init<\/code>&nbsp;binary starts<\/li>\n\n\n\n<li><code>init<\/code>&nbsp;parses the kernel cmdline, which sets&nbsp;<code>ro.kernel.qemu.dalvik.vm.heapsize<\/code>&nbsp;to the correct value the emulator provides (<code>576m<\/code>)<\/li>\n\n\n\n<li><code>init<\/code>&nbsp;starts running the&nbsp;<code>rc init<\/code>&nbsp;scripts<\/li>\n\n\n\n<li><code>on late-init<\/code>&nbsp;runs, which triggers the&nbsp;<code>zygote-start<\/code>&nbsp;event<\/li>\n\n\n\n<li><code>on zygote-start<\/code>&nbsp;runs&nbsp;<code>start zygote<\/code>, which launches the&nbsp;<code>zygote<\/code>&nbsp;process asynchronously<\/li>\n\n\n\n<li><code>on late-init<\/code>&nbsp;continues running, and eventually triggers&nbsp;<code>boot<\/code><\/li>\n\n\n\n<li><em>If the race is lost:<\/em>&nbsp;<code>zygote<\/code>&nbsp;reads the&nbsp;<code>dalvik.vm.heapsize<\/code>&nbsp;property at this point, before the&nbsp;<code>setprop<\/code>&nbsp;in the next step has run, finds it unset, and defaults to&nbsp;<code>16m<\/code><\/li>\n\n\n\n<li><code>on boot<\/code>&nbsp;from&nbsp;<code>init.ranchu.rc<\/code>&nbsp;runs, and calls&nbsp;<code>setprop dalvik.vm.heapsize ${ro.kernel.qemu.dalvik.vm.heapsize}<\/code><\/li>\n\n\n\n<li><em>If the race is won:<\/em>&nbsp;<code>zygote<\/code>&nbsp;reads the&nbsp;<code>dalvik.vm.heapsize<\/code>&nbsp;property only after the&nbsp;<code>setprop<\/code>&nbsp;in the previous step, so it contains the correct value,&nbsp;<code>576m<\/code><\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Making_sure_we_always_win_the_race\"><\/span>Making sure we always win the race<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Now that we understand what is going wrong we need to figure out a fix. 
Fortunately it ends up being very simple in our case since we have root inside the emulator and can change any files we\u2019d like: just statically configure\u00a0<code>dalvik.vm.heapsize<\/code>\u00a0in\u00a0<code>\/system\/build.prop<\/code>\u00a0instead of relying on it to come through as a kernel command line argument. This will be loaded early in the init process so that it\u2019s present before\u00a0<code>zygote<\/code>\u00a0launches and needs it. It\u2019s important to make sure that this value matches the value the emulator is setting on the kernel command line, otherwise you can still encounter some inconsistency if the emulator wins the race and resets the value to something different.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How we found a race condition in the AOSP with the Android emulator that affected the amount of heap space available to apps. Rainforest supports testing native mobile applications on Android using the official android emulator from Google. Using emulators instead of real physical devices provides a bunch of benefits such as being able to 
[&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","inline_featured_image":false,"footnotes":""},"categories":[6],"tags":[],"class_list":["post-359","post","type-post","status-publish","format-standard","hentry","category-engineering"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.rainforestqa.com\/blog\/wp-json\/wp\/v2\/posts\/359","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rainforestqa.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rainforestqa.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rainforestqa.com\/blog\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rainforestqa.com\/blog\/wp-json\/wp\/v2\/comments?post=359"}],"version-history":[{"count":3,"href":"https:\/\/www.rainforestqa.com\/blog\/wp-json\/wp\/v2\/posts\/359\/revisions"}],"predecessor-version":[{"id":994,"href":"https:\/\/www.rainforestqa.com\/blog\/wp-json\/wp\/v2\/posts\/359\/revisions\/994"}],"wp:attachment":[{"href":"https:\/\/www.rainforestqa.com\/blog\/wp-json\/wp\/v2\/media?parent=359"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rainforestqa.com\/blog\/wp-json\/wp\/v2\/categories?post=359"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rainforestqa.com\/blog\/wp-json\/wp\/v2\/tags?post=359"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}