The web application will have a database consisting of millions of hosts (over 20 million, maybe 100 million or more), covering IPv4, IPv6, and domains, plus additional data for each host such as blacklist status, uptime, history, and geo data (country, region, city, ISP). I could, of course, use an RDBMS like MySQL, Oracle, or PostgreSQL, but I want speed and I will have a lot of data. I stumbled upon NoSQL and MongoDB and want to know the best way to develop and manage such a big database.
I was thinking about using a B-tree index on hosts, but then I would have to sort them and use 4 columns for IPv4 and 8 columns for IPv6. For domains I don't know how I would manage that; I might just use full-text search, or maybe a plain sort will be enough.
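One idea I had for the IP side: instead of 4 + 8 separate columns, normalize every address to a fixed 16-byte key (IPv4 mapped into IPv6 space) so a single byte-ordered index covers both families. A rough Python sketch using the standard ipaddress module (ip_sort_key is my own name):

```python
import ipaddress

def ip_sort_key(host: str) -> bytes:
    """Normalize an IPv4 or IPv6 address to a fixed 16-byte sort key.

    IPv4 addresses are mapped into IPv6 space (::ffff:a.b.c.d) so one
    byte-ordered index can cover both families.
    """
    addr = ipaddress.ip_address(host)
    if addr.version == 4:
        addr = ipaddress.IPv6Address(f"::ffff:{addr}")
    return addr.packed  # 16 big-endian bytes; byte order matches numeric order

print(ip_sort_key("192.0.2.1").hex())    # 00000000000000000000ffffc0000201
print(ip_sort_key("2001:db8::1").hex())  # 20010db8000000000000000000000001
```

For domains, the equivalent trick I've seen is storing the reversed name (com.example.www) so related hosts sort together, but I'm not sure that beats plain full-text search.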
The database will be constantly updated and queried, so some clever per-row locking must be in place. The main searches will be by host, geo code, ISP, tags, and hostId.
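If I went with MongoDB, I imagine the indexes backing those searches would look roughly like this (a sketch assuming pymongo; the hostdb name and the nested geo layout are my own, taken from the schema below):

```python
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")  # assumed local instance
db = client["hostdb"]                              # hypothetical database name
hosts = db["hosts"]

# MongoDB indexes are B-trees; a multikey index covers array fields like tags.
hosts.create_index([("host", ASCENDING)], unique=True)  # one live record per host
hosts.create_index([("tags", ASCENDING)])
hosts.create_index([("geo.isp", ASCENDING)])
hosts.create_index([("geo.country", ASCENDING),
                    ("geo.region", ASCENDING),
                    ("geo.city", ASCENDING)])      # compound index for geo queries
```

From what I've read, hostId could simply be the _id field, which is always indexed, and MongoDB's WiredTiger storage engine locks at the document level, which sounds like the per-row locking I want.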
hosts table:
hostId, host (either text or multiple columns), tags (array), blacklists (array), uptime, geoCountry, geoRegion, geoCity, geoIsp, dateInserted
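In MongoDB terms I picture each row as one document, with the geo columns nested and the arrays stored natively (field names and values are just my mapping of the columns above):

```python
from datetime import datetime, timezone

host_doc = {
    "_id": 1,                        # hostId; MongoDB indexes _id automatically
    "host": "203.0.113.7",           # or a domain, or the 16-byte key from above
    "tags": ["scanner", "proxy"],    # arrays are native, no join table needed
    "blacklists": ["dnsbl.example.org"],
    "uptime": 99.7,
    "geo": {                         # nested document instead of flat geo* columns
        "country": "US",
        "region": "CA",
        "city": "Los Angeles",
        "isp": "Example ISP",
    },
    "dateInserted": datetime.now(timezone.utc),
}
```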
There is also a possibility that the database will contain a lot of offline hosts and multiple records per host, so a separate history table must be created: whenever a host goes offline or a change occurs (blacklist, geo IP data), the old record must go into history (see the sketch after the schema below).
hosts_history table:
hostHistoryId, host (either text or multiple columns), tags (array), blacklists (array), uptime, geoCountry, geoRegion, geoCity, geoIsp, dateInserted
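The flow I picture: when a host goes offline or its blacklist/geo data changes, the current document gets copied into hosts_history before the live one is updated. A hypothetical helper (archive_and_update is my own name; db is the handle from the index sketch above):

```python
def archive_and_update(db, host_id, changes):
    """Snapshot the current host record into hosts_history, then apply changes.

    The history copy keeps the original id under hostId and receives its own
    fresh _id, which plays the role of hostHistoryId.
    """
    current = db.hosts.find_one({"_id": host_id})
    if current is None:
        return
    current["hostId"] = current.pop("_id")
    db.hosts_history.insert_one(current)
    db.hosts.update_one({"_id": host_id}, {"$set": changes})

# e.g. archive_and_update(db, 1, {"geo.country": "DE", "geo.isp": "New ISP"})
```

I realize this isn't atomic across the two collections, so I'd either need a multi-document transaction or have to accept a small race there.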
What would be the best approach, and how can I optimize this project for the best possible speed?