DarkBERT could help automate dark web mining for cyber threat intelligence
2023-05-19 10:02

Researchers have developed DarkBERT, a language model pretrained on dark web data, to help cybersecurity pros extract cyber threat intelligence from the Internet's virtual underbelly.

A team of researchers from the Korea Advanced Institute of Science and Technology and data intelligence company S2W set out to test whether a custom-trained language model could be useful for this purpose, and developed DarkBERT, which is pretrained on dark web data.

DarkBERT has undergone extensive pretraining on texts in English - approximately 6.1 million pages found on the dark web.

There are many dark web forums and a huge number of forum posts, so automating the discovery of threads and the evaluation of their noteworthiness could significantly reduce the workload of cybersecurity professionals. The main problem is the specific language used on the dark web.

"Nevertheless, the performance of DarkBERT over other language models shown here is significant and displays its potential in dark web domain tasks. By adding more training samples and incorporating additional features like author information, we believe that detection performance can be further improved."

Researchers found that DarkBERT outperforms other pretrained language models in all the tasks it has been presented with, and concluded that it "shows promise in its applicability on future research in the dark web domain and in the cyber threat industry," though more work and fine-tuning are required to make it more widely applicable.


News URL

https://www.helpnetsecurity.com/2023/05/19/cti-dark-web/