It's fun to look at some of these patents when you know that a good part of what they describe has been implemented. Not everything they discuss makes it in, and some things get added as the processes in a patent are developed.
I saw the three developers from Switzerland. Nice international effort there.
The access logs that they are talking about are from the website where the sitemap is being generated. From earlier on in the patent application:
[0073] The sitemap generator 106 generates sitemaps by accessing one or more sources of document information. In some embodiments, the sources of document information include the file system 102, access logs, pre-made URL lists, and content management systems. The sitemap generator 106 may gather document information by simply accessing the website file system 102 and collecting information about any document found in the file system 102. For instance, the document information may be obtained from a directory structure that identifies all of the files in the file system, or in a defined portion of the file system.
[0074] The sitemap generator 106 may also gather document information by accessing the access logs (not shown) of the website. The access logs record accesses of documents by external computers. An access log may include the URLs of the accessed documents, identifiers of the computers accessing the documents, and the dates and times of the accesses. The sitemap generator 106 may also gather document information by accessing pre-made URL lists (not shown). The pre-made URL lists list URLs of documents that the website operator wishes to be crawled by web crawlers. The URL lists may be made by the website operator using the same format as that used for sitemaps, as described below.
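To make the access-log part of [0074] a little more concrete, here is a rough Python sketch of how a generator might pull URLs, hit counts, and last-access times out of a log. This is only my illustration, not anything taken from the patent: it assumes the log is in Common Log Format, and the function name `collect_from_access_log`, the `base_url` parameter, and the choice to keep only successful GET requests are all my own assumptions.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed input: Common Log Format lines, e.g.
# host ident user [time] "METHOD path PROTOCOL" status size
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def collect_from_access_log(lines, base_url):
    """Hypothetical helper: return (hits, last_seen) keyed by full URL.

    hits      -- Counter of successful GET requests per URL
    last_seen -- most recent access time per URL
    """
    hits = Counter()
    last_seen = {}
    for line in lines:
        m = LOG_LINE.match(line)
        # Skip unparseable lines, non-GET requests, and non-200 responses.
        if m is None or m.group("method") != "GET" or m.group("status") != "200":
            continue
        url = base_url.rstrip("/") + m.group("path")
        when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        hits[url] += 1
        if url not in last_seen or when > last_seen[url]:
            last_seen[url] = when
    return hits, last_seen


# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.1 - - [10/Oct/2005:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2005:14:01:02 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.3 - - [11/Oct/2005:09:00:00 -0700] "GET /about.html HTTP/1.0" 200 512',
    '192.0.2.3 - - [11/Oct/2005:09:00:05 -0700] "GET /missing.html HTTP/1.0" 404 0',
]
hits, last_seen = collect_from_access_log(sample, "http://example.com")
```

The hit counts are the kind of popularity signal that could feed into a crawl priority, and the last-access times give a generator something to work with for each URL it discovers this way.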
I'm not sure that using document popularity to help determine crawler priority makes sense. The more popular pages are likely the ones that search engines have already indexed.
To help you rank, hurt your rank, or both!
That's what I was thinking, too.