Sørensen-Dice Coefficient Algorithm Project
Description
Using the Sørensen-Dice coefficient algorithm (https://en.wikipedia.org/wiki/Dice%27s_coefficient), you must create a combosquatting detector that computes the percentage similarity between a given domain name and each domain in a list read from a .txt file.
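For reference, the coefficient of two sets X and Y is 2|X ∩ Y| / (|X| + |Y|). The brief does not say how a string becomes a set; the sketch below assumes the common choice of character bigrams (adjacent letter pairs) counted as a multiset. The function names are illustrative only.

```python
from collections import Counter


def bigrams(text):
    """Return the multiset of adjacent character pairs in text (assumed tokenisation)."""
    text = text.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def dice_coefficient(a, b):
    """Sørensen-Dice similarity of two strings, from 0.0 to 1.0."""
    if a.lower() == b.lower():
        return 1.0  # also covers two empty or two identical one-character strings
    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        return 0.0  # neither string is long enough to form a bigram
    overlap = sum((x & y).values())  # multiset intersection
    return 2.0 * overlap / total


print(dice_coefficient("night", "nacht"))  # 0.25 (only "ht" is shared)
print(dice_coefficient("xxx", "cxx"))      # 0.5
```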
You must be able to enter the domain of interest, which is then compared against the list of other domains in the .txt file. The program must then print the Dice similarity rate together with the string it was compared to. Before comparing, strip the www prefix and the TLD (.com, .net, or any other) and compare only what lies in between. Example: for www.xxx.com, compare xxx with the corresponding part of each listed website, such as the xxx of www.xxx.net or the cxx of www.cxx.com. Finally, the results must be saved to a new .txt file.
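One possible way to isolate the part that gets compared is sketched below. It assumes the TLD is simply the last dot-separated label, so multi-part suffixes such as .co.uk are not handled; core_label is a hypothetical helper name.

```python
def core_label(domain):
    """Strip a leading 'www.' and the final label (assumed to be the TLD)."""
    name = domain.strip().lower().rstrip(".")
    if name.startswith("www."):
        name = name[4:]
    head, sep, _tld = name.rpartition(".")
    # With no dot left there is no TLD to drop, so keep the name as it is.
    return head if sep else name


print(core_label("www.xxx.com"))  # xxx
print(core_label("www.cxx.net"))  # cxx
```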
The result should look something like: www.xxx.com and www.xxx.net have a similarity rate of: xx%, and so on.
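Reading the domain list and writing the report could look roughly like this. The sketch assumes one domain per line in the input file and a score already computed as a fraction between 0.0 and 1.0; the function names and the whole-percent rounding are choices made for illustration.

```python
from pathlib import Path


def format_result(target, candidate, score):
    """Build one report line; score is a fraction between 0.0 and 1.0."""
    return f"{target} and {candidate} have a similarity rate of: {score:.0%}"


def load_domains(path):
    """Read one domain per line from a text file, skipping blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def save_results(path, lines):
    """Write the report lines to a new text file, one per line."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


print(format_result("www.xxx.com", "www.cxx.com", 0.5))
# www.xxx.com and www.cxx.com have a similarity rate of: 50%
```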
The implementation should run in parallel and be contained in a single, documented and commented Python file. You must not use any API; instead, implement the Sørensen-Dice coefficient algorithm directly.
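The brief does not name a parallelism mechanism. A minimal sketch using the standard library's concurrent.futures is shown here, with a compact bigram-based scorer (bigrams are an assumption) so the example stands alone. The names compare_in_parallel, executor_cls and chunksize are illustrative.

```python
from collections import Counter
# ThreadPoolExecutor is imported only so it can be swapped in for quick tests.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from functools import partial


def dice(a, b):
    """Bigram-based Sørensen-Dice similarity of two strings."""
    if a == b:
        return 1.0
    x = Counter(a[i:i + 2] for i in range(len(a) - 1))
    y = Counter(b[i:i + 2] for i in range(len(b) - 1))
    total = sum(x.values()) + sum(y.values())
    return 2.0 * sum((x & y).values()) / total if total else 0.0


def compare_in_parallel(target, candidates, workers=None,
                        executor_cls=ProcessPoolExecutor, chunksize=256):
    """Score every candidate against target, spreading the work over a pool.

    Returns (candidate, score) pairs in the same order as the input.
    """
    candidates = list(candidates)
    with executor_cls(max_workers=workers) as pool:
        # map() preserves input order; chunksize batches the work for
        # process pools and is ignored by thread pools.
        scores = pool.map(partial(dice, target), candidates, chunksize=chunksize)
        return list(zip(candidates, scores))
```

When the default process pool is used, the call to compare_in_parallel should sit under an `if __name__ == "__main__":` guard, since worker processes may re-import the file on some platforms.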
You must first check which of the two domains is shorter before starting the comparison, so that the comparison is not too expensive when dealing with a huge list.
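One way to read this requirement, offered as an interpretation rather than the required method: with bigram multisets, a string of length n has n - 1 bigrams, and the overlap can never exceed the smaller count. The lengths alone therefore give an upper bound on the score, and pairs whose bound falls below a chosen cut-off can be skipped without computing any bigrams. The threshold parameter below is hypothetical and not part of the brief.

```python
def dice_upper_bound(a, b):
    """Largest Dice score two strings of these lengths could reach (bigram model)."""
    na, nb = max(len(a) - 1, 0), max(len(b) - 1, 0)
    if na + nb == 0:
        return 1.0  # both too short to form bigrams; nothing can be ruled out
    return 2.0 * min(na, nb) / (na + nb)


def worth_comparing(a, b, threshold=0.5):
    """Cheap length-only filter: False means the full comparison can be skipped."""
    return dice_upper_bound(a, b) >= threshold


print(worth_comparing("xxx", "cxx"))         # True
print(worth_comparing("abc", "abcdefghij"))  # False (the bound is 4/11)
```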