Find duplicate entries in a list in bash with sed

2010-06-25 1 min read bash Fedora Learning

Here I will use the rss2email feed list as the example, but the same idea applies to any similar list.

Here is an example of the output of the r2e list command:

1: http://blog.amit-agarwal.co.in/feed (default: amitag@localhost)
2: http://feeds2.feedburner.com/AllAboutLinux (default: amitag@localhost)
3: http://feeds2.feedburner.com/Command-line-fu (default: amitag@localhost)
4: http://blogs.members.freewebs.com/Members/Blogs/viewBlogRSS.jsp?userid=29731143 (default: amitag@localhost)

The goal is to list all duplicate entries, if any. So, first we need to remove the numbers from the beginning and the email ID from the end of each line.

We will use sed to remove the numbers and the email. Here's what we can use for this:

sed 's/^[0-9]*: //'

and

sed 's/ (.*//'

So, let's try both together:

r2e list | sed 's/^[0-9]*: //' | sed 's/ (.*//'
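To try the two substitutions without rss2email installed, here is a minimal sketch that feeds sample lines through a single sed call with two -e expressions. The function name strip_feed_line is hypothetical, chosen only for illustration; the sample lines are copied from the r2e list output shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: strip the leading "N: " index and the trailing
# " (default: ...)" part from each line of r2e-list-style input.
strip_feed_line() {
    sed -e 's/^[0-9]*: //' -e 's/ (.*//'
}

# Two sample lines in the same format as the r2e list output.
sample='1: http://blog.amit-agarwal.co.in/feed (default: amitag@localhost)
2: http://feeds2.feedburner.com/AllAboutLinux (default: amitag@localhost)'

cleaned=$(printf '%s\n' "$sample" | strip_feed_line)
printf '%s\n' "$cleaned"
```

Using one sed process with two expressions behaves the same as chaining two sed commands here; it simply saves a pipe.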

If the output shows only the feed URLs, then it is time to use the uniq command. Since uniq -d reports only adjacent repeated lines, sort the list first:

r2e list | sed 's/^[0-9]*: //' | sed 's/ (.*//' | sort | uniq -d
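The whole idea can be checked on made-up data. The sketch below is only an illustration: the function name find_duplicate_feeds and the example.com URLs are assumptions, not part of rss2email. It also shows why sorting matters, because uniq -d compares only neighbouring lines.

```shell
#!/bin/sh
# Hypothetical helper: print each feed URL that occurs more than once.
find_duplicate_feeds() {
    sed -e 's/^[0-9]*: //' -e 's/ (.*//' | sort | uniq -d
}

# Made-up sample: entry 3 repeats entry 1, with entry 2 in between.
sample='1: http://example.com/feed-a (default: user@localhost)
2: http://example.com/feed-b (default: user@localhost)
3: http://example.com/feed-a (default: user@localhost)'

# With sort, the repeated URL is reported.
dupes=$(printf '%s\n' "$sample" | find_duplicate_feeds)
printf 'with sort: %s\n' "$dupes"

# Without sort, the two copies are not adjacent, so uniq -d finds nothing.
unsorted=$(printf '%s\n' "$sample" | sed -e 's/^[0-9]*: //' -e 's/ (.*//' | uniq -d)
printf 'without sort: %s\n' "$unsorted"
```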
