Question:

What is the fastest way to determine whether an address exists on the blockchain?

Sophia: 25 May 2022

All I want is to determine whether a given address has ANY activity on the blockchain. I've seen projects like bitcoin-abe, which might work, but I wondered whether there is a simpler way than running abe or calling a web service (I'll be making too many queries for the latter).

Thoughts welcome and appreciated.

Answer:
Isabella: 25 May 2022

Using local blockchain.

If you are prepared to keep an up-to-date copy of the blockchain yourself (or already do), I would expect checking locally to be quicker than calling a web service, which you seem to have ruled out already.

Using list of distinct addresses present on the blockchain.

If you only need a true or false answer to the question "is this address present in the blockchain?", you could get faster responses by keeping an up-to-date list of distinct addresses rather than searching the whole blockchain. Once the list is complete, keeping it up to date only requires adding the addresses that appear in the latest block, a small task roughly every 10 minutes.
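As a rough illustration of the idea, here is a minimal Python sketch. The class name `AddressIndex` and its methods are hypothetical, and it assumes you already have some way of extracting the address strings from each block (that part is not shown). It keeps the distinct addresses in a set, so a lookup does not scan the chain at all.

```python
class AddressIndex:
    """Hypothetical in-memory index of every distinct address seen so far."""

    def __init__(self):
        self.addresses = set()
        self.height = -1  # height of the last block folded into the index

    def add_block(self, height, block_addresses):
        # block_addresses: iterable of address strings taken from one block.
        # How they are extracted from the block is assumed, not shown.
        if height != self.height + 1:
            raise ValueError("blocks must be added in order, without gaps")
        self.addresses.update(block_addresses)
        self.height = height

    def __contains__(self, address):
        # Average-case constant-time membership test on the set.
        return address in self.addresses


# Illustrative strings only; any address strings would do.
index = AddressIndex()
index.add_block(0, ["1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"])
seen = "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa" in index
unseen = "1SomeAddressNeverUsedOnChain" in index
```

After the initial build, the only ongoing work is one `add_block` call per new block.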

Using list of abbreviated addresses.

If your purpose permits a margin of error, you could improve speed further by using only the first n characters of the Bitcoin address. Maintain an up-to-date list of distinct addresses abbreviated to n characters, and abbreviate the candidate address to n characters before looking it up.

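Even if you require very accurate results, using only half the characters of a Bitcoin address will still give almost 100% accuracy while giving a significant speed-up. The only possible error is a false positive, where an unseen address shares its first n characters with one on the list; an address that is on the list is always found.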
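A minimal Python sketch of the abbreviated lookup, under the same assumptions as before (the names `PrefixIndex`, `add` and `might_contain` are hypothetical, and the address strings are only illustrative). It stores just the first n characters of each address and truncates the candidate the same way before testing it.

```python
class PrefixIndex:
    """Hypothetical index that stores only the first n characters of each address."""

    def __init__(self, n):
        self.n = n
        self.prefixes = set()

    def add(self, address):
        self.prefixes.add(address[:self.n])

    def might_contain(self, address):
        # True means "probably seen": two different addresses can share a prefix.
        # False is definite: the address was never added.
        return address[:self.n] in self.prefixes


prefix_index = PrefixIndex(n=17)
prefix_index.add("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")

exact_hit = prefix_index.might_contain("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
# Same first 17 characters, different tail: reported as present (a false positive).
collision = prefix_index.might_contain("1A1zP1eP5QGefi2DMXXXXXXXXXXXXXXXXX")
miss = prefix_index.might_contain("1SomeAddressNeverUsedOnChain")
```

The smaller n is, the less memory the set needs and the more often unrelated addresses will collide, so n is the knob that trades accuracy for size.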

Discounting most recent blocks.

If you don't need the most up-to-date information, you could reduce the work of keeping your abbreviated address list current. Only considering blocks more than m blocks deep reduces the number of redundant updates caused by blocks that are later orphaned.
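One way this could be arranged, sketched in Python with hypothetical names (`DelayedIndex`, `on_new_block`, `on_block_orphaned`): hold the newest blocks in a short queue and only fold a block's addresses into the main set once m later blocks have been built on top of it. This reads "m blocks deep" as "m blocks on top", which is an assumption, and it assumes your node tells you when the tip block is orphaned.

```python
from collections import deque


class DelayedIndex:
    """Hypothetical index that ignores the m most recent blocks."""

    def __init__(self, m):
        self.m = m
        self.confirmed = set()   # addresses from blocks buried under m newer blocks
        self.pending = deque()   # (block_hash, addresses), newest on the right

    def on_new_block(self, block_hash, addresses):
        self.pending.append((block_hash, set(addresses)))
        # Anything with m newer blocks on top of it is treated as settled.
        while len(self.pending) > self.m:
            _, settled = self.pending.popleft()
            self.confirmed |= settled

    def on_block_orphaned(self):
        # The tip was orphaned: discard it. The main set was never touched,
        # so there is nothing to undo.
        if self.pending:
            self.pending.pop()

    def __contains__(self, address):
        return address in self.confirmed


delayed = DelayedIndex(m=2)
delayed.on_new_block("b1", ["addr_a"])
delayed.on_new_block("b2", ["addr_b"])
delayed.on_new_block("b3", ["addr_c"])   # b1 now has two blocks on top of it
delayed.on_block_orphaned()              # b3 is dropped before it reaches the set
delayed.on_new_block("b3x", ["addr_d"])
```

A reorganisation deeper than m blocks would still reach the main set, so m needs to be large enough that this is acceptably rare for your purpose.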

The improvements in speed described here will be irrelevant for most purposes unless you are processing a huge number of candidate addresses per second, but your question hints that you may be, for at least some of the time.