Deep dive: Sync download optimization
Before and after
Let's start with an example of what can be achieved by proper download setup. It uses a desktop client and an extremely large database.
Here is a short sync summary using an out-of-the-box setup:
<Summary>
  <Settings App='126.96.36.199' Server='Crm2011.13/Federation' UserMode='Standard' Threads='3'/>
  <Results Sent='0' Recv='23931119' Result='Normal'/>
  <Times Total='17681922ms' Prepare='31ms'/>
  <FullSync Recv='23931119' TotalTime='17681891ms' Entitys='17261688ms' ManyToMany='256140ms'>
    <SyncDownloader CacheSize='250MB' UsedCache='258MB' PausedFor='61%'/>
    <Cleanup RecsDeleted='0' RecordCleanup='163735ms' />
  </FullSync>
  <DownloadPerformance>0.73 ms/rec (good)</DownloadPerformance>
  <ApiCalls Downloader='43629' NNDownloader='869' Attachments='1' Total='44499' />
</Summary>
And here are the results achieved with the maximum possible cache, a large page size, and 4 threads. Sync is 2.8x faster while using only 15% of the API calls:
<Summary>
  <Settings App='188.8.131.52' Server='Crm2011.13/Federation' UserMode='Standard' Threads='4'/>
  <Results Sent='0' Recv='23931119' Result='Normal'/>
  <Times Total='6370422ms' Prepare='31ms'/>
  <FullSync Recv='23931119' TotalTime='6370313ms' Entitys='5971062ms' ManyToMany='256032ms'>
    <SyncDownloader CacheSize='1800MB' UsedCache='1813MB' PausedFor='54%'/>
    <Cleanup RecsDeleted='0' RecordCleanup='142937ms' />
  </FullSync>
  <DownloadPerformance>0.26 ms/rec (exceptional)</DownloadPerformance>
  <ApiCalls Downloader='5856' NNDownloader='869' Attachments='1' Total='6726' />
</Summary>
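The headline figures can be re-derived from the two summaries. The following is a minimal sketch, assuming only the Times and ApiCalls elements are needed; the helper name summary_totals is hypothetical and not part of the product.

```python
import xml.etree.ElementTree as ET

# Only the elements needed for the comparison are copied from each summary.
BEFORE = """<Summary>
  <Times Total='17681922ms' Prepare='31ms'/>
  <ApiCalls Downloader='43629' NNDownloader='869' Attachments='1' Total='44499'/>
</Summary>"""

AFTER = """<Summary>
  <Times Total='6370422ms' Prepare='31ms'/>
  <ApiCalls Downloader='5856' NNDownloader='869' Attachments='1' Total='6726'/>
</Summary>"""


def summary_totals(xml_text):
    """Return (total sync time in ms, total API calls) from a sync summary."""
    root = ET.fromstring(xml_text)
    total = root.find("Times").get("Total")
    if total.endswith("ms"):
        total = total[:-2]  # drop the unit suffix
    api_calls = int(root.find("ApiCalls").get("Total"))
    return int(total), api_calls


before_ms, before_calls = summary_totals(BEFORE)
after_ms, after_calls = summary_totals(AFTER)

speedup = before_ms / after_ms
api_ratio = after_calls / before_calls

print(f"{speedup:.1f}x faster")              # 2.8x faster
print(f"{api_ratio:.0%} of the API calls")   # 15% of the API calls
```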
To start the analysis, run the synchronization and check the sync log. Since release 16.0, the sync log includes hints about configuration changes that can increase performance. If you enable Diagnostic Sync Logs, you will see statistics about the download of each entity.
<Downloader CacheSize=250MB UsedCache=251MB PausedFor=40%/>
14:56:16 xxx_country: 255 recs, 172ms
14:56:16 businessunit: 611 recs, 422ms
14:56:16 systemuser: 18371 recs, 10984ms, 296ms/270K/page | Increase page (now 500)
14:56:16 team: 77187 recs, 49844ms, 321ms/314K/page | Increase page (now 500)
14:56:16 xxx_documenttype: 87 recs, 141ms
14:56:16 xxx_documentsubtype: 86 recs, 109ms
14:56:16 xxx_locationlevel: 403423 recs, 232172ms, 287ms/379K/page | Increase page (now 500)
14:56:27 xxx_educationlevel: 21 recs, 141ms
14:56:27 xxx_exitentrypoint: 2717 recs, 1343ms, 223ms/252K/page | Increase page (now 500)
14:56:28 xxx_ethnicity: 752 recs, 391ms
14:56:29 xxx_occupation: 743 recs, 422ms
14:56:29 xxx_processdetailcode: 32 recs, 109ms
14:56:29 xxx_owningoffice: 516 recs, 297ms
14:56:30 xxx_reception: 34 recs, 266ms
14:56:30 xxx_assistancelocation: 3558 recs, 1594ms, 199ms/192K/page | Increase page (now 500)
14:56:31 xxx_registrationgroup: 328500 recs, 293422ms + pause 83812ms, 446ms/504K/page | Increase page (now 500) <<<< SAVING
14:57:06 xxx_connectionroleinformation: 102 recs, 125ms
14:57:06 xxx_religion: 50 recs, 109ms
14:57:06 xxx_individual: 60000 recs, 95251ms + pause 247812ms, 790ms/1970K/page | Increase page (now 500) | Aborting
15:00:09 xxx_document: 14500 recs, 18124ms + pause 142329ms, 614ms/633K/page | Increase page (now 500) | Aborting
Reading the log
Let's go through the details for the systemuser entity (4th line):
- The download started at 14:56:16.
- 18371 user records were downloaded in a bit less than 11 seconds.
- Each request downloaded 500 records (page size). The request took 296 milliseconds on average (request duration). The response payload was ~270 KB.
- On top of that, we see the recommendation to increase the page size.
The xxx_document entity (last row) shows additional information:
- The entity download was paused for 142 seconds because the cache was full.
- The entity shows the keyword Aborting, meaning the sync was interrupted, but this download thread keeps waiting in the background until it receives the response to the request currently in progress.
- Occasionally, this wait can take longer (up to 2 minutes). The app cannot be terminated cleanly during that time.
Finally, the xxx_registrationgroup entity displays the keyword SAVING. This is the active entity being downloaded and stored in the database. (The other two entities being downloaded and stored into the cache at that moment were xxx_individual and xxx_document.)
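The fields described above can be pulled out of a diagnostic log line with a regular expression. This is a minimal sketch that assumes the line layout matches the sample log exactly; the pattern and the function parse_line are illustrative, not a documented log format or product API.

```python
import re

# Pattern inferred from the sample log lines; the real format may vary.
LINE_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<entity>\w+): "
    r"(?P<recs>\d+) recs, "
    r"(?P<ms>\d+)ms"
    r"(?: \+ pause (?P<pause_ms>\d+)ms)?"
    r"(?:, (?P<req_ms>\d+)ms/(?P<page_kb>\d+)K/page)?"
    r"(?P<rest>.*)"
)


def parse_line(line):
    """Parse one per-entity log line; return None for lines that do not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None

    def to_int(name):
        value = match.group(name)
        return int(value) if value is not None else None

    rest = match.group("rest")
    page = re.search(r"\(now (\d+)\)", rest)
    return {
        "time": match.group("time"),
        "entity": match.group("entity"),
        "recs": to_int("recs"),
        "total_ms": to_int("ms"),
        "pause_ms": to_int("pause_ms") or 0,   # 0 when no pause was logged
        "request_ms": to_int("req_ms"),         # average request duration
        "response_kb": to_int("page_kb"),       # average response payload
        "page_size": int(page.group(1)) if page else None,
        "aborting": "Aborting" in rest,
        "saving": "SAVING" in rest,
    }


doc = parse_line(
    "15:00:09 xxx_document: 14500 recs, 18124ms + pause 142329ms, "
    "614ms/633K/page | Increase page (now 500) | Aborting"
)
print(doc["entity"], doc["pause_ms"], doc["aborting"])
```

Short entries such as xxx_country carry no per-page statistics, so request_ms, response_kb, and page_size come back as None for them.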
Analyzing the log
As you see, we have plenty of material that can be used for analysis and eventual improvement of the download process.
Here are some general tips in this respect, first concerning the page size:
- On the positive side, a larger page size means fewer requests (API calls) and less time lost to network latency.
- On the negative side, it means a larger response size, longer request duration, and worse app responsiveness when aborting sync.
- Response size: As far as client performance is concerned, there should be no problem with even 10 MB response sizes. (Unless you use a very slow device.)
- Request duration: Large values normally indicate that the server has problems with fetch processing.
- Values >10 seconds are dangerous: If the server gets temporarily overloaded, the same request may take many times (even 10x) longer and eventually cause a timeout.
Note: As a rule of thumb, avoid requests taking more than 10 seconds or larger than 10 MB.
What can be deduced from our example?
- The entity xxx_individual has the largest response per page (1970K). Its records carry a lot of data. However, from the memory point of view, even this entity should cope with larger pages.
- The second-largest response per page is 633K (xxx_document), followed by 504K (xxx_registrationgroup).
- Conclusion: As far as client performance is concerned, we can afford to use the maximum page size (i.e., 5000) for all entities except xxx_individual, where a smaller value (2000-3000) seems more appropriate.
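The conclusion can be cross-checked against the rule of thumb with a rough estimate. The sketch below assumes that request duration and response size grow linearly with page size and that "K" in the log means 1024 bytes; both are simplifying assumptions, and estimate_page_size is a hypothetical helper, not a product function.

```python
MAX_PAGE_SIZE = 5000          # maximum page size mentioned in the text
MAX_REQUEST_MS = 10_000       # rule of thumb: requests under 10 seconds
MAX_RESPONSE_KB = 10 * 1024   # rule of thumb: responses under 10 MB


def estimate_page_size(page_size, request_ms, response_kb):
    """Estimate the largest page size that stays within both limits.

    Inputs are the current page size and the per-page averages from the log.
    Assumes linear scaling of duration and payload with page size.
    """
    by_duration = MAX_REQUEST_MS * page_size / request_ms
    by_payload = MAX_RESPONSE_KB * page_size / response_kb
    return int(min(by_duration, by_payload, MAX_PAGE_SIZE))


# Per-page averages taken from the sample log (page size 500).
print(estimate_page_size(500, 790, 1970))  # xxx_individual -> 2598
print(estimate_page_size(500, 296, 270))   # systemuser     -> 5000
```

Under these assumptions, the estimate for xxx_individual is limited by the 10 MB response size and lands in the 2000-3000 range, while the lighter entities reach the 5000 maximum. Real servers rarely scale perfectly linearly, so treat the result as a starting point and verify it against a new sync log.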
What about the server impact?
- Downloading a single entity means executing a fetch. (FullSync fetches are defined in the sync filter.) This, in turn, means executing an SQL query on the server and returning (page by page) all records matching that query.
- If multiple pages are needed, the query must be restarted for each new page. Usually, this takes negligible time, but the difference can be substantial in some cases (typically for slow queries).
- The general conclusion is that a larger page size minimizes server load.
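To see how strongly the page size affects the number of requests (and therefore query restarts), count the pages needed for one large entity. This is a minimal sketch; requests_needed is a hypothetical helper and ignores any extra requests the downloader may issue.

```python
import math


def requests_needed(record_count, page_size):
    """Number of page requests needed to download record_count records."""
    return math.ceil(record_count / page_size)


# xxx_locationlevel from the sample log: 403423 records.
print(requests_needed(403423, 500))   # 807
print(requests_needed(403423, 5000))  # 81
```

Going from a page size of 500 to 5000 cuts the requests for this entity roughly tenfold, which matches the direction of the API call reduction in the two summaries.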
Is there anything else we could deduce?
- The obvious remaining topic is the cache size. A large PausedFor value tells us that enlarging the cache would help. Our example is just such a case. Hence, if your client devices have enough memory, increase the cache to 1 GB or even more. (The more, the better.)
Download statistics also let you trace other problems, such as entities that need to be prioritized. (There is no such entity in our example.)
Optimal number of download threads
Unfortunately, there is no clear answer about how many threads to use. More threads might speed up the sync - or cause larger pauses or even slightly decrease performance. Added threads may cause faster cache exhaustion.
The topic is very complex. Moreover, the answer might differ for FullSync vs. IncSync. Some general guidelines to follow:
- If all your entities are small ("small" means they fill only a tiny fraction of the cache), then more threads help. (This is typical of IncSync, for example.)
- If you have several large entities, adding more threads might have (marginal) adverse effects.
Below, you see a sample setup that employs several measures to improve sync performance.
<SESetup Version='1'>
  <SyncDownloaderSetup>
    <Setup Platform='Windows' DownloadPageSize='5000' DownloadCacheSize='1800' NumDownloadThreads='4'>
      <Entities>
        <Entity Name='xxx_document' DownloadPageSize='3000' />
        <Entity Name='xxx_individual' DownloadPageSize='2000' />
        <Entity Name='annotation' DownloadPageSize='500' HasPriority='true' />
      </Entities>
    </Setup>
  </SyncDownloaderSetup>
</SESetup>
- Increase the cache if you see download pauses and if the client devices have large enough memory.
- Increase the page size as long as the requests take <10 seconds and responses do not exceed 10 MB size.
- You may use different page sizes for different entities.
- Increase the download thread count unless you have several large entities.
- Prioritize the download of dominant entities.
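If you maintain per-entity settings in a script, a setup file with the same shape as the sample can be generated rather than edited by hand. The sketch below is only an illustration: element and attribute names are copied from the sample setup, while the function build_setup and its parameters are hypothetical, and no claim is made about other attributes the product may support.

```python
import xml.etree.ElementTree as ET


def build_setup(platform, page_size, cache_mb, threads, entities):
    """Build an SESetup document shaped like the sample setup.

    entities maps an entity name to a dict of per-entity attributes,
    e.g. {"DownloadPageSize": 2000} or {"HasPriority": "true"}.
    """
    root = ET.Element("SESetup", Version="1")
    downloader = ET.SubElement(root, "SyncDownloaderSetup")
    setup = ET.SubElement(
        downloader,
        "Setup",
        Platform=platform,
        DownloadPageSize=str(page_size),
        DownloadCacheSize=str(cache_mb),
        NumDownloadThreads=str(threads),
    )
    entity_list = ET.SubElement(setup, "Entities")
    for name, options in entities.items():
        attrs = {"Name": name}
        attrs.update({key: str(value) for key, value in options.items()})
        ET.SubElement(entity_list, "Entity", attrs)
    return ET.tostring(root, encoding="unicode")


setup_xml = build_setup(
    "Windows", 5000, 1800, 4,
    {
        "xxx_document": {"DownloadPageSize": 3000},
        "xxx_individual": {"DownloadPageSize": 2000},
        "annotation": {"DownloadPageSize": 500, "HasPriority": "true"},
    },
)
print(setup_xml)
```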