Deduplication of News Stories

From LexisNexis Academic Knowledge Center

Latest revision as of 10:26, 24 February 2011


Deduplicate News Search Results in LexisNexis Academic

Sometimes the same news story appears, with little or no difference, in multiple newspapers or news aggregator sources. The Deduplicate feature lets you remove these duplicates from your result set.

How It Works

When you run an Easy/News search or use one of the special News search forms available on the navigation menu, your results will include the deduplication tool, pictured below. By default, deduplication is set to "Off" so that all search results are displayed. Changing this setting removes duplicate stories from your result set.

[Image: Deduplication11.jpg]

High Similarity vs. Moderate Similarity

You can turn deduplication on at one of two levels: High Similarity or Moderate Similarity. "High Similarity" means that the system requires a high degree of similarity before it considers documents to be duplicates, so fewer stories are removed. "Moderate Similarity" means that the system tolerates a greater difference between two documents and still considers them duplicates, so more stories are removed.
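
The difference between the two levels can be pictured as a similarity threshold. The following is only an illustrative sketch in Python, not how LexisNexis Academic works internally: the source does not say which similarity measure or cutoff values the product uses. The word-shingle Jaccard measure, the `THRESHOLDS` values, and the function names are all assumptions made for this example.

```python
import re

# Hypothetical cutoffs, chosen for illustration only. The actual values
# (and the actual similarity measure) used by the product are not documented here.
THRESHOLDS = {"high": 0.9, "moderate": 0.6}


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def deduplicate(stories, level="off"):
    """Return (kept, duplicates).

    kept is the list of stories that remain in the result list.
    duplicates maps the index of a kept story to the stories hidden behind it,
    which mirrors the option to view the duplicate articles.
    """
    if level == "off":
        return list(stories), {}
    threshold = THRESHOLDS[level]
    kept, kept_shingles, duplicates = [], [], {}
    for story in stories:
        current = shingles(story)
        for index, earlier in enumerate(kept_shingles):
            if jaccard(current, earlier) >= threshold:
                # Similar enough to an earlier story: hide it as a duplicate.
                duplicates.setdefault(index, []).append(story)
                break
        else:
            # No earlier story was similar enough: keep it in the list.
            kept.append(story)
            kept_shingles.append(current)
    return kept, duplicates
```

Under these assumed cutoffs, a story that differs from another only by a short added phrase would stay in the list at the "high" level but be hidden at the "moderate" level, which matches the behavior described above: the moderate setting treats more stories as duplicates.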

Working with Deduplicated Results

Here's a news results set with Deduplication set to "Off":

[Image: Deduplication22.jpg]

Here's a news results set with Deduplication set to "On: High":

[Image: Deduplication333.jpg]


Notice that in the first image, items 5 and 6 are duplicate results. In the second image, the list has been deduplicated, and you still have the option to view the duplicate articles.