Newsletter #2

Since the last newsletter, we have been working on a language to perform searches on the ONYPHE search engine. It went into production last week, at the end of April. Thanks to that, new APIs are now possible and have been made available.

But there is more: we have enriched the threatlist and inetnum categories of data by adding geolocation information. We also added subnet information for every IP address in the synscan and datascan categories of data. This makes it easy to pivot on a host and find anything else on the same subnet.

Finally, we industrialized Dark Web scanning and added two new protocols we are now watching: RDP (Remote Desktop Protocol) and DNS (Domain Name System). This makes a total of 13 protocols we are able to identify, whatever the listening port.

ONYPHE query language

We are in the last mile regarding the launch of our commercial offer. We had to finalize the query language, that is, the filters, before being able to sell the service. This is now done, but it is not yet accessible to the general public (be patient).

This language allows users to query ONYPHE data with filters and CIDR masks, for instance. It is as easy as typing key:value and adding more of these filters until you get just the information you want.

Another special filter is the category one. By default, ONYPHE will search for the datascan category (application data), but you may want to search for resolver data (passive DNS) or pastries (like pastebin).

Sample queries:

category:datascan product:Apache port:443 os:Windows
category:synscan port:23 country:FR os:Linux
category:synscan ip: os:Linux port:23
category:inetnum organization:"OVH SAS"
category:inetnum netname:APNIC-LABS
category:threatlist country:RU
category:threatlist ip:
category:pastries ip:
category:resolver ip:
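
The sample queries above all follow the same key:value pattern, so they can be assembled programmatically. The following is a minimal Python sketch under stated assumptions: build_query is a hypothetical helper, not part of any ONYPHE client, and the rule of quoting values that contain spaces is inferred only from the organization:"OVH SAS" example.

```python
def build_query(category, **filters):
    """Join key:value filters into one search string, category first.

    Hypothetical helper: it only mirrors the shape of the sample queries.
    """
    parts = [f"category:{category}"]
    for key, value in filters.items():
        text = str(value)
        # Quote values containing whitespace, as in organization:"OVH SAS".
        if any(ch.isspace() for ch in text):
            text = f'"{text}"'
        parts.append(f"{key}:{text}")
    return " ".join(parts)


print(build_query("datascan", product="Apache", port=443, os="Windows"))
# category:datascan product:Apache port:443 os:Windows
print(build_query("inetnum", organization="OVH SAS"))
# category:inetnum organization:"OVH SAS"
```

Keyword arguments keep their order, so filters appear in the query in the order they are passed.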

With full access, nearly 70 filters are available. You may use these filters from the search line on the Web site or from the API, as described in the dedicated documentation. We will document this language more thoroughly once it is available to end users.

Geolocation enrichment for threatlist and inetnum categories

Starting from February 2017, we added geolocation information to threatlists. This was not done before because the threatlists we aggregate do not provide this information. We considered that it may make sense: if an IP was classified as a threat and its geolocation changed from one day to the next, it may not be harmful anymore. We let the user decide how to interpret that new information.

The same is true for inetnum: it is good to get these netblocks from the RIRs, but only the country is given. By adding geolocation information, we can enrich them with organization and GPS coordinates, for instance. Adding the organization makes it possible to perform a netname-to-organization lookup (or the reverse).

Both of these enrichments are now readily available on any new data.

Subnet enrichment

When searching for information about your own IP addresses, you may find yourself in a situation where you want to find everything on your complete subnet. For that to work, the subnet information has to be stored somewhere. This is now done: subnet information is added to synscan and datascan, and to every category where geolocation is applied.

Thanks to that, you will be able to pivot on ip or subnet data with a simple click (or query filter) when the commercial offer becomes available.
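
To illustrate what pivoting from a single address to its subnet means, here is a small Python sketch using the standard ipaddress module. The /24 prefix length is purely an assumption for the example; the text does not say which subnet size ONYPHE records, and subnet_of and same_subnet are hypothetical helper names.

```python
import ipaddress


def subnet_of(ip, prefix_len=24):
    """Return the enclosing subnet of an address as a CIDR string.

    prefix_len=24 is an assumption made for this example only.
    """
    # strict=False lets us pass a host address; host bits are masked off.
    return str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))


def same_subnet(ip_a, ip_b, prefix_len=24):
    """True when both addresses fall in the same subnet."""
    return subnet_of(ip_a, prefix_len) == subnet_of(ip_b, prefix_len)


print(subnet_of("192.0.2.77"))                  # 192.0.2.0/24
print(same_subnet("192.0.2.77", "192.0.2.200"))   # True
print(same_subnet("192.0.2.77", "198.51.100.1"))  # False
```

Storing the subnet string alongside each record is what turns "find everything near this host" into a plain equality filter.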

Scanning the Dark Web

Another addition is the scanning of the so-called Dark Web, that is, the .onion Web sites reachable only from the Tor network. We have compiled a first-pass list of nearly 40,000 onion sites. Thanks to that list, we will be able to crawl the Dark Web and enrich the list by discovering new onion links, just like any search engine.

At the time of writing and taking into account this list, we have indexed more than 5,100 active hidden sites.

Note: don’t try the displayed search query as it is only available for ONYPHE purposes.

New watched protocols and fingerprinting

Finally, we added two new protocols along with fingerprinting of services: RDP (Remote Desktop Protocol) and DNS (Domain Name System). For RDP, we are able to differentiate between the Microsoft implementation and the XRDP one, which lets us enrich the information with the os. That’s a start and should be very helpful.

For the DNS protocol, we simply use the version.bind request. Here are the TOP 10 products in use on the Internet, whether resolvers or authoritative servers. The percentages are relative to this TOP 10 only, not to all detected servers. Thus, BIND accounts for 78% of the TOP 10 products discovered on the Internet.
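
For readers unfamiliar with the version.bind request: it is an ordinary DNS question for the name version.bind with record type TXT in the CHAOS class. The Python sketch below builds such a query packet by hand following the DNS wire format. It only illustrates the request itself; how ONYPHE sends it or parses the answers is not described here, and build_version_bind_query and the default query ID are hypothetical.

```python
import struct

QTYPE_TXT = 16    # TXT record type
QCLASS_CHAOS = 3  # CH (CHAOS) class


def build_version_bind_query(query_id=0x1234):
    """Return the raw bytes of a version.bind CHAOS TXT query."""
    # Header: ID, flags (plain query), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed, terminated by a zero byte.
    labels = (b"version", b"bind")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    question = qname + struct.pack("!HH", QTYPE_TXT, QCLASS_CHAOS)
    return header + question


packet = build_version_bind_query()
print(len(packet))  # 30
```

The packet would typically be sent over UDP to port 53; a server that answers returns its version string in the TXT record, which is what makes product fingerprinting possible.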


As you can see, we have many new additions to share with you in this newsletter. Our next announcement will be the final pricing and the opening of commercial subscriptions.

In the meantime, for those not already registered, you can create your free user account and gain access to your API key by registering here:

Newsletter #1

We have been working on the ONYPHE portal to bring new additions.

One of them is the preparation of the commercial launch of the service along with the user API and the other one is the addition of an abuse field for some categories of data.

The pricing model will be disclosed at a later time, when we are ready to launch the commercial service.

Abuse email address field added to inetnum category

Following a request from a user of the service, we have added the extraction of abuse email addresses from RIR data (RIPE, for instance). You will be able to look up abuse email addresses for a given IP address. The field’s name is “abuse”.


Of course, this field is now available from the inetnum API. It may be composed of multiple addresses, so expect it to be a multi-valued one.
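
Because the field can hold several addresses, client code should handle both a single value and a list. The Python sketch below shows one way to normalize it. The record shape (a dictionary with an "abuse" key) and the example addresses are hypothetical; consult the API documentation for the actual response format.

```python
def abuse_addresses(record):
    """Return the abuse addresses of a record as a sorted, de-duplicated list.

    The record layout is an assumption for this example.
    """
    value = record.get("abuse", [])
    # Accept a lone string as well as a list of strings.
    if isinstance(value, str):
        value = [value]
    return sorted({addr.strip().lower() for addr in value if addr.strip()})


print(abuse_addresses({"abuse": ["abuse@example.net", "Abuse@example.net", "noc@example.net"]}))
# ['abuse@example.net', 'noc@example.net']
print(abuse_addresses({"abuse": "abuse@example.org"}))
# ['abuse@example.org']
```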

Should you have requests for additions, you can reach us at support[at]

Limitation of the number of requests

We are working on the capability to sell the service, and before we go to market, we have to be able to limit the number of queries a user can make on a monthly basis.

For now, it is set to 0, meaning it is still unlimited. We will activate the limitation on the number of queries when we are ready to launch the commercial service.
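
The convention that a limit of 0 means "unlimited" is easy to get wrong in client or server code, so here is a minimal Python sketch of the check. The function and parameter names are hypothetical and do not reflect the actual ONYPHE implementation.

```python
def can_query(used_this_month, monthly_limit=0):
    """Decide whether one more query is allowed this month.

    A monthly_limit of 0 means no limit is enforced.
    """
    if monthly_limit == 0:
        return True
    return used_this_month < monthly_limit


print(can_query(1_000_000))                   # True: limit 0 is unlimited
print(can_query(99, monthly_limit=100))       # True: one query left
print(can_query(100, monthly_limit=100))      # False: quota exhausted
```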

New API: user

The user API gives you information about your user account. For instance, you will be able to make a free query to know how many credits remain.

The field giving this information is named “credits”. More information about this new API is available on the documentation page.


You can test the service for free: just register to get access to your free API and receive updates via our newsletter:

Samba Internet Exposure

Back in November 2017, a number of security vulnerabilities were disclosed impacting numerous versions of Samba software. CVE-2017-14746 is a use-after-free issue, while CVE-2017-15275 is a heap memory information leak. The former impacts all Samba versions starting from 4.0.0, while the latter affects all versions starting from 3.6.0. Now, the question we may ask is: how many of these affected products can be reached from the Internet?
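
The version ranges above translate into a simple comparison. The Python sketch below applies only the lower bounds stated in this post; it deliberately ignores the fixed releases published afterwards, so a real check would also need upper bounds taken from the Samba advisories. The function names are hypothetical.

```python
import re

# Lower bounds as stated in the text; patched releases are NOT modeled.
AFFECTED_FROM = {
    "CVE-2017-14746": (4, 0, 0),
    "CVE-2017-15275": (3, 6, 0),
}


def parse_version(text):
    """Extract the first x.y.z version found in a banner, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def possibly_affected(version_text):
    """List the CVEs whose stated lower bound the version reaches."""
    version = parse_version(version_text)
    if version is None:
        return []
    return [cve for cve, floor in AFFECTED_FROM.items() if version >= floor]


print(possibly_affected("Samba 4.5.10"))  # ['CVE-2017-14746', 'CVE-2017-15275']
print(possibly_affected("Samba 3.6.25"))  # ['CVE-2017-15275']
print(possibly_affected("Samba 3.2.15"))  # []
```

Comparing versions as integer tuples avoids the classic string-comparison mistake where "3.10.0" sorts before "3.6.0".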

Samba Exposure

This question is important because, if successfully exploited, these issues may lead to the compromise of affected devices with, as a potential result, new hosts joining yet another botnet. By performing a simple search on ONYPHE with the string “samba”, we find around 1 million results.

The next obvious question is now: how many of these hits are using a vulnerable version of Samba? By querying for the TOP 10 versions of Samba, we obtain the following results:

80% of the TOP 10 versions are running vulnerable versions of Samba. That means slightly more than 37,000 devices may be at risk of compromise.

Note: these results were collected at the end of November 2017.

We were specifically searching for Samba 3.6.x and 4.x. Now, those versions may not be the most prevalent on the Internet, so what about querying for the most frequently seen responses to a Samba query listing available shares? We can do that by querying for the TOP 10 MD5 sums computed over the collected banners.

Our results show that only two MD5 sums account for roughly 600,000 results. For instance, if you query one of these sums, you will find more than 300,000 results:
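
The idea of ranking banners by MD5 sum can be sketched in a few lines of Python: hash each banner, then count how often each hash occurs, so that devices returning byte-identical banners collapse into one bucket. The sample banners are invented for the example, and top_banner_hashes is a hypothetical helper, not ONYPHE's actual pipeline.

```python
import hashlib
from collections import Counter


def top_banner_hashes(banners, n=10):
    """Return the n most common (md5_hex, count) pairs over the banners."""
    sums = (hashlib.md5(banner.encode("utf-8")).hexdigest() for banner in banners)
    return Counter(sums).most_common(n)


# Invented sample data: three identical banners and one different.
sample = ["share list A", "share list A", "share list B", "share list A"]
for md5_hex, count in top_banner_hashes(sample):
    print(md5_hex, count)
```

MD5 is used here only as a cheap fingerprint for grouping identical responses, not for any security purpose.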

In fact, if you check for distinct IP addresses resulting from those two hashes, you will find around 300,000 unique addresses. That’s because those devices are exposing Samba through both ports 139/tcp and 445/tcp.
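
The gap between the number of results and the number of devices comes from counting one result per (address, port) pair. A short Python sketch with invented sample data makes the de-duplication explicit; the result layout (dictionaries with "ip" and "port" keys) is an assumption for the example.

```python
from collections import defaultdict


def ports_by_ip(results):
    """Group scan results by address, collecting the ports seen for each."""
    ports = defaultdict(set)
    for result in results:
        ports[result["ip"]].add(result["port"])
    return dict(ports)


# Invented sample: two hosts, each exposing Samba on 139/tcp and 445/tcp.
results = [
    {"ip": "192.0.2.10", "port": 139},
    {"ip": "192.0.2.10", "port": 445},
    {"ip": "192.0.2.20", "port": 139},
    {"ip": "192.0.2.20", "port": 445},
]
grouped = ports_by_ip(results)
print(len(results), "results,", len(grouped), "unique addresses")
# 4 results, 2 unique addresses
```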

They are all Samba 3.2.15 hosts within the Emirates Telecommunications Corporation organization. The exact same product is behind this Samba version: the D-Link DIR850L. The good news is that it is not impacted by the previously discussed CVEs. Unfortunately, if you search for vulnerabilities impacting this product, you find a blog post dating back to September 2017 describing a fair number of issues:


The results shown here were presented at the latest Botconf security conference in Montpellier, France during a lightning talk. We showed that Samba is quite heavily exposed on the Internet and may be abused to build a botnet, just like many other vulnerable products.

If you are interested in querying our data, you can register for free to get your API key and have access to ONYPHE queries.