Major changes for historical queries and roadmap 2023
Written on 2023/01/16
First of all, we wish you and your loved ones a wonderful 2023; may it be a year of fulfillment in your lives.
Please read the message below carefully, as we are announcing major changes to our historical data storage scheme. We have worked hard to make sure all your use cases will be covered after this change by reviewing all your queries, but feel free to reach out to us at support at onyphe.io so that we can analyze them on a case-by-case basis.
For raw data customers, there are absolutely no changes.
A new license is born
Griffin View is the latest license available. It offers the following benefits over the Eagle View license:
- New categories dedicated to the DNS enumeration process: hostname, domain & ip. The hostname category stores 12 months of historical data for every FQDN we have seen anywhere in our collected data. The domain category lists all domain names we have seen in the same way. The ip category relates to live IP addresses we have seen. For instance, we have more than 165 million domains listed, 1 billion hostnames listed & 337 million unique IP addresses seen live in the last 12 months;
- New Discovery API perfectly suited for bulk queries. Do you have tens, hundreds or thousands of unique domain names in your perimeter? You can launch bulk searches over such volumes in a single query, making it faster and easier to collect data on your exposed assets;
- 12 months of historical data for the hostname, domain, ip & vulnscan categories;
- New APIs will be added for high-level use cases like automating asset inventory or identifying risks on exposed assets.
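As a rough illustration of the bulk Discovery use case above, here is a minimal Python sketch. The endpoint path, payload format and header name are assumptions on our part, not the documented API; check the official ONYPHE API documentation before use.

```python
# Hypothetical sketch of a bulk Discovery API call. The endpoint path,
# payload format and auth header are ASSUMPTIONS -- verify them against
# the official ONYPHE API documentation.
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder
BASE = "https://www.onyphe.io/api/v2"  # assumed base URL

def bulk_discovery(category: str, domains: list) -> urllib.request.Request:
    """Build one bulk Discovery request covering many domains at once."""
    # Assumed wire format: newline-separated values posted in one body.
    body = "\n".join(domains).encode()
    url = f"{BASE}/bulk/discovery/{category}"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"apikey {API_KEY}",
                 "Content-Type": "text/plain"},
        method="POST",
    )

# One query for a whole perimeter of domain names:
req = bulk_discovery("domain", ["example.com", "example.org"])
print(req.full_url)
```

The point of the sketch is the shape of the use case: one request for the whole perimeter instead of one request per domain.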
Our pricing is public: pricing.
Major historical queries changes
In the coming months, we will roll out a major update to our datamodel design. Until now, we have kept 7 months of historical data with the same datamodel. To pursue our quest for longer historical data and improve our collection coverage, we will change that model.
The easy way would have been to add more storage capacity, but we have already rolled out changes impacting the size of stored data. As we don’t want to reflect that increase in license prices, we searched for a smarter way of achieving our goals. Furthermore, as you are aware, energy prices in Europe have risen, and our hosting providers reflected that change in their prices by up to 15%. This had an impact on our costs too, and we decided not to pass that increase on to our license prices.
By changing this model, we will free up to 50% of our storage, which will be converted into globally better coverage and more frequently refreshed data. Another advantage will be the ability to deploy scanners in other parts of the world, such as Asia and the US. In the past, we focused on deploying scanners in Europe and Canada.
Please be assured that this change is focused on improving our customers’ user experience and use cases. We wouldn’t have designed this new scheme if it added no benefits for you, our customers.
- Better refresh rate for data. Currently, we refresh most data once a month; we want to refresh it twice a month instead;
- Scanning for IPv6 in datascan and scanning from different parts of the world. With these changes, we will have scanners deployed in Europe, Canada, Hong Kong & the US. In fact, we have already started deploying them. The goal is to have a view of exposed assets from different parts of the world. Of course, the location of the scanner that has seen the data is stored in the datamodel with the new node.physicalcountry field;
- Full-text searches on historical data for the datascan category will only be possible on the first 4KB of raw responses instead of 16KB (the increased field size was introduced back in October 2022);
- Only country-level precision for geolocation;
- Fewer fields will be available for filtering data in historical queries.
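To illustrate the new node.physicalcountry field, here is a hedged Python sketch that builds a search query filtering on scanner location. The search endpoint shape and the exact filter syntax are assumptions on our part, to be verified against the official API documentation.

```python
# Sketch of a search query filtering on the new node.physicalcountry
# field. The endpoint shape is an ASSUMPTION based on the public v2
# API -- verify it against the official ONYPHE documentation.
from urllib.parse import quote

BASE = "https://www.onyphe.io/api/v2/search"  # assumed endpoint

def search_url(oql: str) -> str:
    """URL-encode a query string for the search endpoint."""
    return f"{BASE}/{quote(oql)}"

# Keep only results collected by scanners physically located in Hong Kong:
url = search_url("category:datascan domain:example.com node.physicalcountry:HK")
print(url)
```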
Only some categories of information are impacted for data older than 30 days (the default query time range); other categories are not subject to any change.
To sum it up, here are the fields that will be kept for historical data (older than 30 days):
- synscan: @timestamp, ip, ipv6, port, osvendor, os, tag, cpe, country, organization, geolocus.country, geolocus.organization, geolocus.domain;
- resolver: @timestamp, ip, ipv6, type, ns, mx, soa, spf, txt, cname, data, datamd5, tag, hostname, host, subdomains, domain, tld, country, organization, geolocus.country, geolocus.organization, geolocus.domain;
- ctl: @timestamp, fingerprint.sha256, ip, serial, tag, issuer.organization, subject.altname, subject.commonname, subject.organization, subject.email, validity.notafter, validity.notbefore, hostname, host, subdomains, domain, tld;
- datascan: @timestamp, ip, ipv6, port, forward, url, protocol, tag, tls, transport, data (4KB instead of 16KB), datamd5, datammh3, country, organization, geolocus.country, geolocus.organization, geolocus.domain, issuer.organization, fingerprint.sha256, serial, subject.altname, subject.commonname, subject.organization, subject.email, validity.notafter, validity.notbefore, hostname, host, subdomains, domain, tld, osvendor, os, osversion, osversionpatch, productvendor, product, productversion, productversionpatch, app.http.component.productvendor, app.http.component.product, app.http.component.productversion, app.http.component.productversionpatch, cpe, cve, device.class, device.productvendor, device.productversion, device.productversionpatch, app.ftp.anonymous, app.smb.nullsession, app.http.title, app.http.headermd5, app.http.headermmh3, app.http.bodymd5, app.http.bodymmh3, app.http.tracker.ga, app.http.tracker.gaw, app.http.tracker.gtm, app.http.tracker.gpub, app.http.tracker.fbq, app.http.tracker.snaptr.
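Since only the fields listed above will remain filterable on historical data, a simple client-side check can catch queries that rely on dropped fields before they are sent. The sketch below is our own illustration; the helper and the abbreviated field set are not part of the ONYPHE API.

```python
# Client-side sanity check: does a historical datascan query use only
# fields that will remain available past the 30-day default range?
# The field set is an abbreviated subset of the list above; the helper
# itself is a hypothetical illustration, not an ONYPHE API feature.

KEPT_DATASCAN_FIELDS = {
    "ip", "port", "protocol", "tag", "domain", "hostname", "cpe", "cve",
    "app.http.title", "product", "productversion", "country",
}

def fields_ok(oql: str) -> bool:
    """Return True if every field in the query is kept historically."""
    fields = {tok.split(":", 1)[0] for tok in oql.split() if ":" in tok}
    fields.discard("category")  # the category selector is not a data field
    return fields <= KEPT_DATASCAN_FIELDS

print(fields_ok("category:datascan product:nginx country:FR"))  # kept fields
print(fields_ok("category:datascan city:Paris"))  # city is dropped historically
```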
Roadmap 2023
Here is what we have planned for 2023:
- Improved scanning for favicons;
- Improved scanning for .js files;
- IPv6 scanning for datascan, currently only synscan category is IPv6 enabled;
- Better coverage for Web sites. The goal is to refresh the root directories of all Web sites once a month;
- Twice a month refresh rate for datascan category;
- Even more historical data for specific categories (hostname, domain, ip & vulnscan) under the Griffin View license;
- On-demand scanning capabilities, subject to a different license than the public ones;
- Switching to our own geolocation database, dubbed geolocus (https://www.geolocus.io). This geolocation service provides country-level precision, not city-level. Thus, the city field will be discontinued;
- Inventory & Audit APIs for the Griffin View license;
- A brand new Web site.
Again, rest assured that we keep pursuing our goal to have the most comprehensive data on exposed assets for our customers. We created ONYPHE with a simple goal: having a better view than bad guys on your exposed assets to help you fix issues before they are exploited.
From a scheduling perspective, we will roll out this change over the course of the coming months, aligning with our current storage capacities. We will migrate historical data monthly, staying at the maximum our storage allows.
Please reach out to us at support at onyphe.io so that we can review all your use cases and make sure we haven’t missed any of them with this major change in historical data storage.