Help with encrypted web pages?

I am trying to write a personal spider to crawl through websites and create a highly specialized personal list of sites and pages that I may like to see based on preferences that I have supplied. I have found some interesting pages - interesting in the fact that they use javascript to encrypt the pages to block people from "stealing their content".

There are javascript tricks that you can use on the downloaded encrypted page to get around these irritations. You have to run a javascript line in the browser's address line and you get another window with the unencrypted HTML in it. But, I want to see the HTML unencrypted without downloading every image, wav, activex object and flash thingy on the page into an actual webbrowser control. Utilizing a webbrowser control for this, and having to dl all images and such, would dramatically decrease the speed the spider can crawl at.

An example of an encrypted page can be found at http://www.aw-soft.com/htmlguard-sample.html. A simple Javascript way to defeat it is by pasting "javascript:window.open('about:blank').document.write('<pre>' + document.documentElement.outerHTML.replace(/</g, '&lt;') + '</pre>')" in the IE address bar and clicking GO.

I have no interest in (or drive space for) mass web page content theft.

But, is there anything in the .Net framework that will help with viewing an encrypted web page's source for my spider? It seems I need to be able to run Javascript to decode the page into readable HTML.....but, as I may have said, I am only after the HTML....I really don't want to DL the pics and all that other stuff - it kills my speed.

Any ideas?


"smerf" <sm***@shroom.com> wrote in news:zEaVg.41126$KR1.9960@bignews2.bellsouth.net:

> But, is there anything in the .Net framework that will help with
> viewing an encrypted web page's source for my spider? It seems I need
> to be able to run Javascript to decode the page into readable
> HTML.....but, as I may have said, I am only after the HTML....I really
> don't want to DL the pics and all that other stuff - it kills my
> speed.

No, you'll need to run the javascript to decode it.


Hello smerf,
If you studied the js and did a lil investigation you could easily figure this out.

Here's some of the guts to get you started:

function p(y) {
    var d='',i,r,m,g;
    for(i=1;i<=y.length;i++) {
        r=y.charAt(i-1);
        m=b.indexOf(r);
        if(m>-1) {
            g=((m+1)%n-1);
            if(g<=0) {
                g+=n
            }
            d+=b.charAt(g-1)
        } else {
            d+=r
        }
    }
    k+=d
};

function fff() {
    document.write(k);e=""
}

-Boo
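Read on its own, the posted `p` function looks like a simple substitution: each character found in the alphabet string `b` is swapped for the character one position earlier, wrapping around at the start, at least when `n` equals `b.length`. The sketch below restates that arithmetic as a pure function so it can be tried in isolation. The name `decodeChunk` and the sample alphabet are made up for illustration; the real page defines its own `b`, `n` and `k`, and their values are not shown in the thread.

```javascript
// Restatement of the posted p() as a pure function.
// `alphabet` and `n` stand in for the page's globals `b` and `n`;
// their real values are not shown in the thread.
function decodeChunk(encoded, alphabet, n) {
  let decoded = '';
  for (let i = 0; i < encoded.length; i++) {
    const ch = encoded.charAt(i);
    const m = alphabet.indexOf(ch);
    if (m > -1) {
      // Same arithmetic as the original, working on 1-based positions.
      let g = ((m + 1) % n) - 1;
      if (g <= 0) {
        g += n;
      }
      decoded += alphabet.charAt(g - 1);
    } else {
      // Characters outside the alphabet pass through unchanged.
      decoded += ch;
    }
  }
  return decoded;
}

// Made-up alphabet, purely for illustration.
const sampleAlphabet = 'abcde';
console.log(decodeChunk('bcdea!', sampleAlphabet, sampleAlphabet.length)); // prints "abcde!"
```

Returning the decoded string instead of appending to a global `k` keeps the function easy to test; the original accumulates into `k` and writes it out later in `fff`.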
Code is not what I need. I need a JavaScript interpreter that I can include in my app. Then I could grab the HTML (encoded or not) and run the javascript to decode the HTML page.

But I can't find one that I can include with my app.

Of course, even if I did this, there's also VBScript encoding and God only knows what else to contend with......

I may just be relegated to doing it the slow way. Even then, it seems that there should be some way to tell the webbrowser control to load a page into the DOM but not to retrieve images or audio or (*insert bandwidth wasting object name here*).

Can you turn off downloading of everything but the text in a webbrowser control?

I just want the unencrypted, unobfuscated HTML code to scan. Nothing else.

"GhostInAK" <ghosti***@gmail.com> wrote in message news:be1391bf1b53c8c8b67c5133dbf0@news.microsoft.com...
> Hello smerf,
>
> If you studied the js and did a lil investigation you could easily figure
> this out.
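The idea described here (fetch the raw HTML, run only its scripts, and collect whatever they write) can be illustrated without a browser by handing the scripts a stub `document` whose `write` method simply records its arguments. The sketch below uses Node.js's built-in `vm` module, not anything from the .NET Framework, so it shows the approach and does not answer the .NET question. `captureDocumentWrite` and the sample script strings are hypothetical. Two caveats: `vm` is not a security boundary, so untrusted page scripts need stronger isolation, and a real page that touches other DOM APIs would fail against this stub.

```javascript
const vm = require('vm');

// Runs the given script sources in order against a stub `document`
// and returns everything they passed to document.write/writeln.
// No images, audio or other resources are ever requested.
function captureDocumentWrite(scriptSources) {
  const written = [];
  const sandbox = {
    document: {
      write: (...parts) => written.push(parts.join('')),
      writeln: (...parts) => written.push(parts.join('') + '\n'),
    },
  };
  // Many page scripts refer to `window`; point it at the sandbox itself.
  sandbox.window = sandbox;
  const context = vm.createContext(sandbox);
  for (const source of scriptSources) {
    // The timeout guards against scripts that loop forever.
    vm.runInContext(source, context, { timeout: 1000 });
  }
  return written.join('');
}

// Hypothetical stand-ins for the <script> blocks of a downloaded page.
const pageScripts = [
  "var k = ''; k += '<p>decoded</p>';",
  'document.write(k);',
];
console.log(captureDocumentWrite(pageScripts)); // prints "<p>decoded</p>"
```

All scripts share one context so that globals set by an earlier block (such as `k` in the posted decoder) are visible to later ones, which mirrors how a browser runs consecutive script blocks on a page.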
Hello smerf,

A browser is really the only thing that will give you reliable results.

-Boo


Smerf,
Do you really think that we are supplying code here to hack people's email addresses to help spammers?

It is a fool who supplies that.

Cor

"smerf" <sm***@shroom.com> wrote in message news:zEaVg.41126$KR1.9960@bignews2.bellsouth.net...
> But, is there anything in the .Net framework that will help with viewing
> an encrypted web page's source for my spider?

And if I was looking to "hack", do you think I would have come here instead of alt.hack or one of a thousand websites with black art experts?

Get a life Cor.

"Cor Ligthert [MVP]" <notmyfirstn***@planet.nl> wrote in message news:ehjUsrP6GHA.2380@TK2MSFTNGP02.phx.gbl...
> Smerf,
>
> Do you really think that we are supplying code here to hack people's email
> addresses to help spammers?
>
> It is a fool who supplies that.
>
> Cor


Smerf,
You take it very personally. If we supply that code, it is freely available on the Internet. One search on Google would then make our hiding of email addresses pointless.

Why do you take it so personally? I thought that there was nothing personal in my reply.

Cor

"smerf" <sm***@shroom.com> wrote in message news:RIoVg.34442$tT6.10036@bignews7.bellsouth.net...
> And if I was looking to "hack", do you think I would have come here
> instead of alt.hack or one of a thousand websites with black art experts?
>
> Get a life Cor.


"Do you really think that we are supplying code here to hack people's email addresses to help spammers?"

I was the one requesting the help. So, you basically accused me of being a spammer. You assumed the code would be used for crap like email collections - that we ALL (me included) HATE. (If I had my way, we could all kill spammers and hackers in the streets.)

If you had simply meant that *others* may use it for such, you could have offered help via email or some other venue. (My guess is that you have no such code - which makes your accusation even more personal.)

You should be more careful in how you reply to others' posts. I would never assume the worst of someone when I didn't know all the facts. If I thought the info they were asking for was inappropriate, I would simply keep moving. I wouldn't reply at all. Maybe you should do the same.

And, FYI, there are ways to obfuscate only your email addresses, in web pages, that help to hide them from bots.

Personally, I don't even post my email address on my sites in the HTML. It is on the page as a human-readable image. If someone wants to email me, they have to type in my email address or use the reply form on the site that sends me an email from server-side script - also effectively hiding my email address.

Now, move along. I have work to do.

"Cor Ligthert [MVP]" <notmyfirstn***@planet.nl> wrote in message news:eAAkDVW6GHA.2380@TK2MSFTNGP02.phx.gbl...
> Smerf,
>
> You take it very personally. If we supply that code it is free on
> Internet.
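As one illustration of obfuscating only the email address in a page, a commonly seen approach is to emit the address as numeric character references, which a browser renders normally but which a naive text-matching bot may not recognize. This is not necessarily the method meant in the post above (which describes an image and a server-side form), and it only deters simple harvesters. The function name and the sample address are hypothetical.

```javascript
// Encodes every character of an address as a decimal numeric
// character reference, e.g. "@" becomes "&#64;".
function toCharacterReferences(address) {
  return Array.from(address)
    .map((ch) => `&#${ch.codePointAt(0)};`)
    .join('');
}

// Hypothetical address; the result is meant to be placed in the page's HTML.
console.log(toCharacterReferences('me@example.com'));
```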