
How to filter domains from other text?

Status
Not open for further replies.

NameMatters

Level 8
Legacy Exclusive Member
Joined
Oct 25, 2005
Messages
1,954
Reaction score
5
I have a lot of stats pages that include all the numbers, etc. How do I grab just the domain names from text that runs to thousands of lines? I know there is a tool that does this, but I can't remember which one :)


Example:--------------------------------------------------

1051fm.com41521647704272592763470831320310880502942203656591247115050625263099150440803020517303801190120512505296000184107690672836028623393173111463
Am730.com1945232231502255752705107443186350090564831860150691673949080217279945827190851911403122033447378201951708212039455505227003145094834105086080751
985fm.com487353584902987933091812085200323764205042013119029747605959392351993059294944166469616351548663196625071330271063464034633490659063234176658913563224198241841155
Fm949.com5090959240236918096844741007688801800207413185
-----------------------------------------------------------------------
I just need to list all the domains from text like this, which runs to thousands of lines.


Any help is appreciated.

thanks,
Sai.
 

AlienGG

Level 8
Legacy Gold Member
Joined
Nov 10, 2006
Messages
1,347
Reaction score
0
Copy and paste the text into MS Word.
Open Find and Replace (Ctrl+H).
Find: ^#^#^#^#^#^#^#^#^# <== change the number of ^# if an actual domain name on the list contains a run of digits that long.
Replace: *space*
Replace All
Keep doing that until nothing more is replaced.
Then find: *space*^#
Replace: *null*
Replace All
Keep doing it until nothing more is replaced.

*space* = a space
*null* = nothing
^# = Word's wildcard for any single digit
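
For anyone who would rather script it than use Word, the same idea (delete the digits that trail each domain) can be sketched in a few lines of Python. This is only a rough sketch, not a tested tool: the function name `strip_trailing_digits` is made up for illustration, and it assumes every domain ends in a letters-only extension such as .com, so that the unwanted stats are exactly the run of digits at the end of the line.

```python
import re

# Assumption: the domain ends in an alphabetic extension (.com, .net, ...),
# so everything unwanted is the run of digits at the very end of the line.
TRAILING_DIGITS = re.compile(r"\d+\s*$")


def strip_trailing_digits(line: str) -> str:
    """Return the line with surrounding whitespace and trailing digits removed."""
    return TRAILING_DIGITS.sub("", line.strip())


# Shortened versions of the sample lines from the first post.
sample = [
    "1051fm.com41521647704272592763",
    "Fm949.com5090959240236918096844741007688801800207413185",
]
for line in sample:
    print(strip_trailing_digits(line))
```

Unlike the repeated find-and-replace passes, this leaves digits at the start or in the middle of a name (1051fm, Fm949) alone, because only the digits at the end of the line are removed.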
 

Dale Hubbard

Formerly 'aZooZa'
Legacy Exclusive Member
Joined
Jan 24, 2003
Messages
5,578
Reaction score
91
Do any of the domain names start with numbers?

Does each line START with the domain name?

Also, is all the garbage you don't want of fixed length?
 

NameMatters

Level 8
Legacy Exclusive Member
Joined
Oct 25, 2005
Messages
1,954
Reaction score
5
Wow, those were quick responses, guys! :)

I will try the options. Yes, some names start with numbers.

thanks again,
 

Dale Hubbard

Formerly 'aZooZa'
Legacy Exclusive Member
Joined
Jan 24, 2003
Messages
5,578
Reaction score
91
What will and will not work depends on the size of his input data. Anything that uses 'copy and paste' will fall over with a large data set in Windows. If all the domains are at the start of lines then the job is dead easy, even if there are millions of lines.
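
To illustrate why the start-of-line case is easy, here is a minimal Python sketch that reads the file one line at a time, so the size of the file should not matter much. The names `extract_domains` and `extract_file` and the file paths are hypothetical. It assumes each domain is a single label plus a letters-only extension (as in the sample lines) and that the stats following it begin with a digit; names such as example.co.uk would be cut short by this pattern.

```python
import re
from typing import Iterable, Iterator

# Assumption: a domain is letters/digits/hyphens, a dot, then a letters-only
# extension. The match stops at the first digit after the extension.
DOMAIN_AT_START = re.compile(r"^([a-z0-9-]+\.[a-z]{2,})", re.IGNORECASE)


def extract_domains(lines: Iterable[str]) -> Iterator[str]:
    """Yield the domain found at the start of each line, skipping other lines."""
    for line in lines:
        match = DOMAIN_AT_START.match(line.strip())
        if match:
            yield match.group(1)


def extract_file(src_path: str, dst_path: str) -> None:
    """Stream src_path line by line and write one domain per line to dst_path."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for domain in extract_domains(src):
            dst.write(domain + "\n")
```

For example, `extract_file("stats.txt", "domains.txt")` would write the cleaned list to a new file, where both file names are placeholders. Lines that do not start with something domain-shaped, such as the dashed separator lines, are simply skipped.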
 

NameMatters

Level 8
Legacy Exclusive Member
Joined
Oct 25, 2005
Messages
1,954
Reaction score
5
Dale, thanks again. Yes, all the domains are at the start of the page.
 

Dale Hubbard

Formerly 'aZooZa'
Legacy Exclusive Member
Joined
Jan 24, 2003
Messages
5,578
Reaction score
91
Why don't you chuck the file up as a .zip on one of these free uploading sites, then I'll get it, strip out the rubbish and send you the file.
 

stuff

Mr Domeen
Legacy Exclusive Member
Joined
Mar 30, 2002
Messages
4,356
Reaction score
35
This is how I do it: first I use list cleanup, and then I use a program called list detective by "local whois", a very helpful tool. It supports .com, .net, and .org. It's a very old program and I don't know where I got it, but it works.
 

A D

Level 14
Legacy Exclusive Member
Joined
Feb 20, 2003
Messages
15,040
Reaction score
1,188
I just needed a tool like this a minute ago and forgot it existed until I googled something similar and it popped up.

That was cool!

-=DCG=-
 