Want To Do More In-depth Load Optimization? We Get Help From One Developer

Introduction

Whether on an engine team or a game team, the importance of optimization is self-evident. In this article, “Yuefu Elementary School Students” from Yuefu Interactive Entertainment explains how the team achieved deeper loading optimization by modifying the engine source code during actual project development.

There is a well-known saying in the game industry: “Third-rate games do functions, second-rate games do performance, and first-rate games do optimization.” It is an exaggeration, but not an unreasonable one: at the very least, it shows how much optimization matters in game development. This article shares how Yuefu improved loading performance, using a real project as the example.

The engine version used in this article is Cocos Creator 2.4.6.

1. Original reproduction

The loading process of Cocos Creator

The above is the loading process of loadRes, and the key steps are described as follows:

  • url transform: converts the project path or UUID into the actual resource address.

  • load res: performs the file IO and converts the loaded data into the corresponding JSON object or binary array.

  • parse: parses the loaded data into the corresponding asset object.

  • depends: collects the dependencies of the current resource, then runs the same steps from the beginning for each of them.
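The four steps above can be sketched as a small recursive loader. This is only an illustration of the flow, not the engine's implementation: the in-memory `fakeDisk`, the `res/import/` path layout, and all function names are assumptions made for the example.

```javascript
// Hypothetical in-memory "disk": resolved path -> raw file content.
const fakeDisk = {
  "res/import/prefab-1.json": JSON.stringify({ type: "Prefab", deps: ["frame-1"] }),
  "res/import/frame-1.json": JSON.stringify({ type: "SpriteFrame", deps: ["tex-1"] }),
  "res/import/tex-1.json": JSON.stringify({ type: "Texture2D", deps: [] }),
};

let ioCount = 0; // counts how many files were read

// 1. url transform: UUID -> actual resource address (path layout is made up).
function urlTransform(uuid) {
  return `res/import/${uuid}.json`;
}

// 2. load res: the IO step (synchronous here only to keep the sketch short).
function loadRes(path) {
  ioCount += 1;
  if (!(path in fakeDisk)) throw new Error(`missing file: ${path}`);
  return fakeDisk[path];
}

// 3. parse: raw content -> object.
function parse(raw) {
  return JSON.parse(raw);
}

// 4. depends: collect the dependencies and run the same steps for each of them.
function loadAsset(uuid, cache = new Map()) {
  if (cache.has(uuid)) return cache.get(uuid);
  const asset = parse(loadRes(urlTransform(uuid)));
  cache.set(uuid, asset);
  asset.resolvedDeps = asset.deps.map((dep) => loadAsset(dep, cache));
  return asset;
}

const prefab = loadAsset("prefab-1");
console.log(prefab.type, ioCount); // one read per configuration file in the chain
```

Every configuration file in the dependency chain costs one read, which is why the rest of the article concentrates on reducing the number of files.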

Dissecting the loading flow of Prefab

The left side of the above process clearly shows the loading pipeline of Cocos Creator. The engine source code shows that custom pipeline steps can be inserted anywhere between url transform and depends, which makes the pipeline flexible and extensible.

The right side shows the loading process of a cc.SpriteFrame resource. To highlight the difference, we compare it with CCSprite loading in Cocos2d-x:

It is easy to see that creating a Sprite in Cocos Creator takes two more steps than in Cocos2d-x. In terms of IO, loading a single texture in Cocos Creator requires two more reads (the SpriteFrame configuration and the Texture2d configuration) than in Cocos2d-x. So are these two configurations necessary?

The answer has to start with the characteristics of Cocos Creator itself:

  1. SpriteFrame configuration file (hereafter [Configuration 1]): an independent JSON file that stores information such as the 9-slice borders and the texture size and offset. It makes it easier to customize a texture, for example by adjusting its borders. It corresponds to the information in the following properties panel.

TIPS: In the Cocos2d-x era, this configuration was saved in the file generated by the corresponding UI editor. Resources not referenced by any UI had to be configured in code.

  2. Texture2d configuration (hereafter [Configuration 2]): mainly defines texture-related properties.

The figure above shows two properties, WrapMode and FilterMode, which give us more flexibility in how images are used and configured.

To sum up, the two additional configurations in the Cocos Creator loading process are necessary. So is there room to load them more efficiently?

Choose Inline or Merge

There are merge options for texture configuration on the official build release interface:

The official documentation explains it as follows:

Inline all SpriteFrames

When automatically merging resources, merge all SpriteFrames and the resources they depend on into the same package. Recommended for the web platform: once enabled, the total package size increases slightly and a little more network traffic is consumed, but the number of network requests drops significantly. It is recommended to disable this option on native platforms because it increases the size of hot updates.

Merge SpriteFrames in atlas

Combine all SpriteFrames in the atlas into the same package. It is disabled by default. When enabled, it can reduce the number of SpriteFrame files that need to be downloaded during a hot update, but if the number of SpriteFrames in the atlas is large, it may slightly extend the startup time on the native platform.

If there are many atlases in the project, it may cause the project.manifest file to be too large. It is recommended to check this option to reduce the size of the project.manifest file.

Note: During a hot update, make sure this option has the same on/off state in the old and new projects. Otherwise, resource reference errors will occur after the hot update.

The simple explanation is:

  • Inline: Merge the JSON file [Configuration 1] corresponding to SpriteFrame into the prefab.

  • Merge Atlas: Combine all SpriteFrames in the automatic atlas into one file, similar to TexturePacker’s plist file.

The advantages and disadvantages of each are described in detail in the official documentation. So is there a solution that can improve the loading efficiency without affecting the startup speed?

3. The Answer

The solutions adopted in this project are:

  1. Merge all SpriteFrame configurations to reduce IO.

  2. Convert the merged configuration into a binary file for faster startup.

SpriteFrame configuration optimization

The following is a SpriteFrame configuration file. From one file to the next, only “e8Ueib+qJEhL6mXAHdnwbi” (the dependency) and the data area in the middle differ:

[
  1,
  [
    "e8Ueib+qJEhL6mXAHdnwbi"
  ],
  [
    "_textureSetter"
  ],
  [
    "cc.SpriteFrame"
  ],
  0,
  [
    {
      "name": "default_btn_normal",
      "rect": [
        0,
        0,
        40,
        40
      ],
      "offset": [
        0,
        0
      ],
      "originalSize": [
        40,
        40
      ],
      "capInsets": [
        12,
        12,
        12,
        12
      ]
    }
  ],
  [
    0
  ],
  0,
  [
    0
  ],
  [
    0
  ],
  [
    0
  ]
]

Solution

  1. Define the identical part in code as a template (removing the redundant data), then extract the differing parts of every file and merge them into a single file with the following layout:
{[
{
      "name": "default_btn_normal",
      "rect": [
        0,
        0,
        40,
        40
      ],
      "offset": [
        0,
        0
      ],
      "originalSize": [
        40,
        40
      ],
      "capInsets": [
        12,
        12,
        12,
        12
      ],
      "depend": "e8Ueib+qJEhL6mXAHdnwbi" // additional field
 },

...

 ],
 [uuid1,uuid2,...] // additional field: the uuid of each file, in the same order as the entries above
}
  2. Convert the file into a binary format. This reduces the file size, speeds up initialization, and removes redundant data and field names. FlatBuffers is recommended as the binary format; for details, refer to online tutorials or the official documentation.

  3. Take over the game download process to ensure that the files are read normally.

  4. Take over IO: modify the builtin/jsb-adapter/engine/jsb-fs-utils.js file and add the following:

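    // Register a custom JSON reader. The handler returns true when it has handled the file.
    setJsonReadHandler (handler) {
        fsUtils._customJsonLoadHandler = handler;
    },
    readJson (filePath, onComplete) {
        // Let the custom handler serve the file first; otherwise fall back to the normal read.
        let jsonLoadHandler = fsUtils._customJsonLoadHandler;
        if (jsonLoadHandler && jsonLoadHandler(filePath, onComplete)) {
            return;
        }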
        fsUtils.readFile(filePath, 'utf8', function (err, text) {
            var out = null;
            if (!err) {
                try {
                    out = JSON.parse(text);
                }
                catch (e) {
                    cc.warn(`Read json failed: path: ${filePath} message: ${e.message}`);
                    err = new Error(e.message);
                }
            }
            onComplete && onComplete(err, out);
        });
    },

Note: The modification above is for the native side. On the web side, a custom loading pipeline can do the same job.

  5. Data restoration: restore the SpriteFrame format from the template data and the binary data. The data area can be kept as a FlatBuffers object and parsed only where it is used:
[
  1,
  [
    "e8Ueib+qJEhL6mXAHdnwbi"
  ],
  [
    "_textureSetter"
  ],
  [
    "cc.SpriteFrame"
  ],
  0,
  [
    // flatbuffer object
  ],
  [
    0
  ],
  0,
  [
    0
  ],
  [
    0
  ],
  [
    0
  ]
]
  6. Modify the CCSpriteFrame.js file to change how the data is parsed:
_deserialize: function (data, handle) {
        if (!CC_EDITOR && data.bb) {
            this._deserializeWithFlatbuffers(data);
            return;
        }
        ...
}
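The merge step and the data-restoration step above can be sketched in plain JSON, leaving out the FlatBuffers conversion. This is a minimal sketch under assumptions: the merged file is modeled as a two-element array `[frames, uuids]`, and `mergeSpriteFrames`, `restoreSpriteFrame`, and the uuid `frame-uuid-1` are hypothetical names, not the project's actual code.

```javascript
// The part that is identical in every SpriteFrame file (see the listing above).
// The dependency slot (index 1) and the data slot (index 5) are filled in on restore.
const TEMPLATE = [1, [null], ["_textureSetter"], ["cc.SpriteFrame"], 0, [null], [0], 0, [0], [0], [0]];

// Build time: pull the varying parts out of each file and merge them.
// `files` maps a SpriteFrame uuid to its parsed JSON (hypothetical input shape).
function mergeSpriteFrames(files) {
  const frames = [];
  const uuids = [];
  for (const [uuid, json] of Object.entries(files)) {
    frames.push({ ...json[5][0], depend: json[1][0] });
    uuids.push(uuid); // kept in the same order as `frames`
  }
  return [frames, uuids];
}

// Run time: rebuild the original per-file layout from the template.
function restoreSpriteFrame(merged, uuid) {
  const [frames, uuids] = merged;
  const index = uuids.indexOf(uuid);
  if (index < 0) return null;
  const { depend, ...data } = frames[index];
  const out = JSON.parse(JSON.stringify(TEMPLATE)); // deep copy of the template
  out[1][0] = depend;
  out[5][0] = data;
  return out;
}

const original = [1, ["e8Ueib+qJEhL6mXAHdnwbi"], ["_textureSetter"], ["cc.SpriteFrame"], 0,
  [{ name: "default_btn_normal", rect: [0, 0, 40, 40], offset: [0, 0],
     originalSize: [40, 40], capInsets: [12, 12, 12, 12] }],
  [0], 0, [0], [0], [0]];

const merged = mergeSpriteFrames({ "frame-uuid-1": original });
const restored = restoreSpriteFrame(merged, "frame-uuid-1");
console.log(JSON.stringify(restored) === JSON.stringify(original)); // round trip check
```

With thousands of SpriteFrames, a uuid-to-index map built once at startup would be a better fit than `indexOf`; the linear search is kept here only for brevity.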

Texture2d configuration optimization

Texture2d is configured as follows:

[
  1,
  0,
  0,
  [
    "cc.Texture2D"
  ],
  0,
  [
    "0,9729,9729,33071,33071,0,0,1",
    -1
  ],
  [
    0
  ],
  0,
  [],
  [],
  []
]

Compared with the SpriteFrame configuration, the Texture2d configuration is much simpler; its property values come mainly from the property panel and the file extension. If all images use the default properties and share the same extension, their Texture2d configurations are exactly the same. That is, if the project has 200 such images, all 200 configuration files are identical.
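To make the link to the property panel concrete, the string "0,9729,9729,33071,33071,0,0,1" from the listing above can be decoded as follows. The numeric values are standard OpenGL constants, but the field order used here (extension id, min filter, mag filter, wrap S, wrap T, premultiply alpha, generate mipmaps, packable) is an assumption based on a reading of the 2.4 engine source and should be checked against the Texture2D deserializer of your engine version. `decodeTextureConfig` is a hypothetical helper.

```javascript
// Standard OpenGL enum values for filter and wrap modes.
const GL_NAMES = {
  9728: "NEAREST",
  9729: "LINEAR",
  10497: "REPEAT",
  33071: "CLAMP_TO_EDGE",
  33648: "MIRRORED_REPEAT",
};

// Decode the comma-separated property string of a Texture2D configuration.
// ASSUMPTION: the field order below is not confirmed by the article.
function decodeTextureConfig(text) {
  const fields = text.split(",");
  if (fields.length !== 8) {
    throw new Error(`expected 8 fields, got ${fields.length}`);
  }
  const [ext, min, mag, wrapS, wrapT, premultiply, mipmaps, packable] = fields;
  const name = (code) => GL_NAMES[code] || `unknown(${code})`;
  return {
    extensionId: Number(ext), // index into the engine's list of file extensions
    minFilter: name(min),
    magFilter: name(mag),
    wrapS: name(wrapS),
    wrapT: name(wrapT),
    premultiplyAlpha: premultiply === "1",
    genMipmaps: mipmaps === "1",
    packable: packable === "1",
  };
}

const decoded = decodeTextureConfig("0,9729,9729,33071,33071,0,0,1");
console.log(decoded);
```

Under that assumption, the sample string describes a texture with linear filtering and clamp-to-edge wrapping, which is why untouched images all end up with the same file content.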

Solution

Compare all Texture2d configuration files by md5, keep only the distinct ones, and generate a mapping table for fast lookup. Take my current project as an example: it has 9000+ image resources, but the comparison found only five distinct configurations. These five configurations are written directly in code, and the IO takeover described above returns the matching one.
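A minimal sketch of this idea, assuming Node.js for the build step: group the configuration files by the md5 of their content, then serve the shared copy from a handler that follows the same contract as the setJsonReadHandler hook shown earlier (return true when the file was handled). The names `groupByMd5`, `makeTextureConfigHandler`, and `uuidFromPath`, the sample uuids, and the file path are all hypothetical.

```javascript
const crypto = require("crypto");

// Build time: group Texture2D configuration files by the md5 of their content.
// `configs` maps a texture uuid to the raw text of its JSON file (hypothetical input).
function groupByMd5(configs) {
  const uniqueTexts = [];        // each distinct configuration, stored once
  const indexByHash = new Map(); // md5 -> index into uniqueTexts
  const indexByUuid = {};        // uuid -> index into uniqueTexts
  for (const [uuid, text] of Object.entries(configs)) {
    const hash = crypto.createHash("md5").update(text).digest("hex");
    if (!indexByHash.has(hash)) {
      indexByHash.set(hash, uniqueTexts.length);
      uniqueTexts.push(text);
    }
    indexByUuid[uuid] = indexByHash.get(hash);
  }
  return { uniqueTexts, indexByUuid };
}

// Run time: build a handler that answers from the table instead of reading a file.
function makeTextureConfigHandler(table, uuidFromPath) {
  return function (filePath, onComplete) {
    const index = table.indexByUuid[uuidFromPath(filePath)];
    if (index === undefined) return false; // unknown file: fall back to the normal read
    // Parse a fresh copy so callers cannot modify the shared configuration.
    onComplete && onComplete(null, JSON.parse(table.uniqueTexts[index]));
    return true;
  };
}

// Hypothetical path layout: the uuid is the file name without ".json".
const uuidFromPath = (p) => p.split("/").pop().replace(/\.json$/, "");

const DEFAULT_CFG = '[1,0,0,["cc.Texture2D"],0,["0,9729,9729,33071,33071,0,0,1",-1],[0],0,[],[],[]]';
const REPEAT_CFG = '[1,0,0,["cc.Texture2D"],0,["0,9729,9729,10497,10497,0,0,1",-1],[0],0,[],[],[]]';

const table = groupByMd5({ "tex-a": DEFAULT_CFG, "tex-b": DEFAULT_CFG, "tex-c": REPEAT_CFG });
const handler = makeTextureConfigHandler(table, uuidFromPath);

let served = null;
const handled = handler("res/import/te/tex-b.json", (err, json) => { served = json; });
console.log(table.uniqueTexts.length, handled, served[3][0]);
```

In the sketch the table is built and used in the same process; in a real project the grouping would run in a build script and only the resulting table would ship with the game.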

In a test on an iPhone 6, loading speed increased by about 43% after the optimization.

Texture2d loading process optimization

In the native texture loading process, the texture data is converted into an ArrayBuffer and passed to JavaScript, then reassembled in the JavaScript layer and passed back to the C++ layer. That makes two data transfers. The process is as follows:

Optimization direction: once loading completes, let the native layer finish the job in one step by creating the Texture2d object directly and returning it, which removes the intermediate data transfers. The revised process is as follows (the red box marks the omitted part):

Note: With this change, dynamic atlasing is no longer available on the native side. However, most native projects use compressed textures, which do not support dynamic atlasing anyway, so this limitation can be ignored.

The code is modified as follows:

C++ part:

cocos2d-x/cocos/scripting/js-bindings/manual/jsb_global.cpp

if (loadSucceed)
{
  // Plain JS object that carries the result back to the JavaScript callback.
  se::Object* retObj = se::Object::createPlainObject();
  retObj->root();
  refs.push_back(retObj);
  // Create the native texture and wrap it in a JS object of the bound Texture2D class.
  cocos2d::renderer::Texture2D* cobj = new (std::nothrow) cocos2d::renderer::Texture2D();
  auto obj = se::Object::createObjectWithClass(__jsb_cocos2d_renderer_Texture2D_class);
  obj->setPrivateData(cobj);
  // Describe the texture using the decoded image information.
  cocos2d::renderer::Texture::Options options;
  options.bpp = imgInfo->bpp;
  options.width = imgInfo->width;
  options.height = imgInfo->height;
  options.glType = imgInfo->type;
  options.glFormat = imgInfo->glFormat;
  options.glInternalFormat = imgInfo->glInternalFormat;
  options.compressed = imgInfo->compressed;
  options.hasMipmap = false;
  options.premultiplyAlpha = imgInfo->hasPremultipliedAlpha;
  // Hand the pixel data to the texture directly, without a round trip through JavaScript.
  std::vector<cocos2d::renderer::Texture::Image> images;
  cocos2d::renderer::Texture::Image image;
  image.data = imgInfo->data;
  image.length = imgInfo->length;
  images.push_back(image);
  options.images = images;
  cobj->initWithOptions(options);
  // Return the texture together with its size.
  retObj->setProperty("texture", se::Value(obj));
  retObj->setProperty("width", se::Value(imgInfo->width));
  retObj->setProperty("height", se::Value(imgInfo->height));
  seArgs.push_back(se::Value(retObj));
  imgInfo = nullptr;
}

JS code modification:

builtin/jsb-adapter/builtin/jsb-adapter/HTMLImageElement.js

set src(src) {
    this._src = src;
    jsb.loadImage(src, (info) => {
        if (!info) {
            this._data = null;
            return;
        } else if (info && info.errorMsg) {
            this._data = null;
            var event = new Event('error');
            this.dispatchEvent(event);
            return;
        }
        this.width = this.naturalWidth = info.width;
        this.height = this.naturalHeight = info.height;
        if (info.texture) {
            // The native layer already created the texture, so use it directly.
            info.texture._ctor();
            this.texture = info.texture;
        }
        else {
            ...
        }
        this.complete = true;
        var event = new Event('load');
        this.dispatchEvent(event);
    });
}

engine/cocos2d/core/assets/CCTexture2D.js

_nativeAsset: {
    get () {
        // maybe returned to pool in webgl
        return this._image;
    },
    set (data) {
        if (data.texture) {
            this.initWithTexture(data.texture, data.width, data.height)
            return
        }
        ...
    }
},
// Add the following function
initWithTexture (texture, pixelsWidth, pixelsHeight) {
    this._texture = texture
    this.width = pixelsWidth;
    this.height = pixelsHeight;
    // Notify the native side to update the texture settings. If the texture properties were never modified, this code rarely runs.
    // The _updateNative flag is recorded when the current object is deserialized; it is true if the information in the configuration does not match the default values.
    if (this._updateNative) {
        var opts = _getSharedOptions();
        opts.minFilter = FilterIndex[this._minFilter];
        opts.magFilter = FilterIndex[this._magFilter];
        opts.wrapS = this._wrapS;
        opts.wrapT = this._wrapT;
        // A simple update function must be added on the native side: take the original update function and strip out the texture data upload. It is not posted here.
        texture.update(opts, true);
    }
    }

    this.loaded = true;
    this.emit("load");
    return true;
},

In a test on an iPhone 6, loading speed increased by about 12%-15% after the optimization.

The statistics above measure the time from before to after a Prefab is loaded, including the time spent loading textures asynchronously, so the totals are fairly long. Still, the time-consuming synchronous work is gone, and there is no noticeable lag on the iPhone 6.

Additional questions

Spine loading optimization

Since Spine skeletal animation data is loaded separately on the native side, the skeleton file does not need to be loaded again from JavaScript, which saves one IO.

Modify deserialize.js as follows:

function deserialize (json, options) {
    ...    
    // Load the native dependency unless this is a Spine skeleton file on the native side, where Spine loads the skeleton file itself.
    asset._native && (asset.__nativeDepend__ = !CC_JSB || !(asset instanceof sp.SkeletonData));
    pool.put(tdInfo);
    return asset;
}

Path Search (fullPathForFilename)

The first time a file path is resolved, the file has to be searched for in every search path. Tests on a Xiaomi 5 showed that each path check takes about 2 ms. We usually have two search paths: the hot-update path and the package path. So on the Xiaomi 5, a file that is only found in the last path takes at least 4 ms to locate.

Solution:

Generate a path mapping table yourself, since the set of files is fixed at packaging time and after each update. With such a table, a file lookup takes less than 50 μs.
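A path mapping table of this kind can be sketched as a plain map from relative file name to resolved path. The directory names, the input shape `filesBySearchPath`, and the function names below are assumptions for illustration; the real table would be generated by the packaging and hot-update tooling.

```javascript
// Search paths in priority order: hot-update directory first, then the package.
// Both directory names are hypothetical.
const SEARCH_PATHS = ["update/", "package/"];

// Build the table once, at packaging time or after a hot update finishes.
// `filesBySearchPath` lists the relative files present under each search path.
function buildPathTable(filesBySearchPath) {
  const table = new Map();
  for (const root of SEARCH_PATHS) {
    for (const relative of filesBySearchPath[root] || []) {
      // The first search path that contains the file wins, as in a normal search.
      if (!table.has(relative)) table.set(relative, root + relative);
    }
  }
  return table;
}

// Run time: one map lookup instead of one file-existence check per search path.
function resolveFullPath(table, relative) {
  return table.get(relative) || null;
}

const pathTable = buildPathTable({
  "update/": ["res/import/a.json"],
  "package/": ["res/import/a.json", "res/import/b.json"],
});

console.log(resolveFullPath(pathTable, "res/import/a.json")); // updated copy wins
console.log(resolveFullPath(pathTable, "res/import/b.json")); // only in the package
console.log(resolveFullPath(pathTable, "res/import/c.json")); // not present anywhere
```

The table has to be regenerated whenever a hot update adds, replaces, or removes files; otherwise lookups would point at stale locations.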


This article is mainly meant to share an approach to, and a direction for, loading optimization. Interested developers are welcome to join the discussion in the forum post.