Monday 25 January 2010

Plagiarism and the Web Revolution

Professors Against Plagiarism:

Web plagiarism has become an epidemic in academia, largely as a result of the high precision and recall of the Google search engine and the huge volumes of intellectual property on the web.

According to Northern Kentucky University (library.nku.edu), many students think that it is acceptable to “paraphrase” the works of others.  The university offers one of the best definitions of plagiarism that I have seen:

Students anxious about committing plagiarism often ask:

 "How much do I have to change a sentence to be sure I'm not plagiarizing?" A simple answer to this is: If you have to ask, you're probably plagiarizing. This is important.

 Avoiding plagiarism is not an exercise in inventive paraphrasing. There is no magic number of words that you can add or change to make a passage your own.

 Original work demands original thought and organization of thoughts.

As a retired Adjunct Professor Emeritus who makes my living selling my words, I find plagiarism especially offensive.  Plagiarism, by its very definition, cannot be an accident; it is an intentional act of theft.  No amount of excuses or pleas of ignorance can exonerate a plagiarist from their fundamental dishonesty.

The Semantics of Plagiarism

As we have noted, plagiarism involves the stealing of “original work and thought”.  Plagiarism can be subtle, and many students believe that they can change one or two words in a sentence and avoid detection.  So, how do we detect the work of the sly plagiarist who replaces words with synonyms and alters the sentence structure?

·        Synonyms – A reference to “House” could be changed to dwelling, abode, apartment, etc.

·        Word Stems – A reference to “house” could be changed to housing, houses, housed, etc.

·        Semantic Structure – Adverbs and adjectives can be replaced and altered to conceal the crime.

Fortunately, despite these attempts to hide their wrongdoing, the plagiarist is still detected thanks to sophisticated web tools and the world of applied Artificial Intelligence.  Sophisticated software such as that found at Turnitin.com employs pattern-matching algorithms that glean the “meaning of each phrase” and compare it to existing works on the Internet.

Let’s take a closer look at how this works.  Software such as the Princeton WordNet provides hierarchies of synonyms that can replicate the plagiarist’s attempts to conceal their theft.  This author has worked extensively with semantic networks, and they can often lead to surprising results.  Once, I entered a semantic search against a major legal database to see if any published court ruling had ever used the profane “F” word.

I ran the search using full synonym expansion and was surprised to find dozens of results, each with the highlighted word “Congress”.  Confused, I consulted the semantic network and discovered that a “Congress”, like the “F” word, is a union of two bodies!

Other web search engine companies are developing search tools that have the surprising side-effect of being able to detect plagiarism.  Their goal is to allow web users to highlight a paragraph of text and press a button called “Show me more like this”.

Internally, these tools analyze the paragraph, apply structure, word stem and synonym rules, and scour the web for a suitable match.
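
The synonym and word-stem rules described above can be sketched in a few lines of Python.  This is only a minimal illustration under stated assumptions, not how Turnitin.com or any search engine actually works: the SYNONYMS table, the crude stem function, and the two-word "shingle" comparison are all hypothetical stand-ins for a real lexical database such as WordNet and a real stemmer.

```python
import re

# Hypothetical, hand-made synonym table (an assumption for this sketch);
# a real tool would draw on a lexical database such as WordNet.
SYNONYMS = {
    "dwelling": "house",
    "abode": "house",
    "apartment": "house",
    "home": "house",
    "purchased": "buy",
    "bought": "buy",
}

# Crude suffix stripping that stands in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "e", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Lowercase, tokenize, map synonyms to one canonical word, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(SYNONYMS.get(tok, tok)) for tok in tokens]


def shingles(tokens, size=2):
    """Return the set of overlapping word n-grams in a token list."""
    return {tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def similarity(a, b, size=2):
    """Jaccard similarity of the normalized shingle sets of two passages."""
    sa, sb = shingles(normalize(a), size), shingles(normalize(b), size)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

With this toy table, "The family bought a house near the river" and "The family purchased a dwelling near the river" normalize to the same token list, so similarity scores them as identical even though two words were swapped.  A production system would also have to handle reordered sentences and far larger vocabularies, which this sketch does not attempt.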

The Epidemic of Web Content Theft

A televised investigative report on the hit TV show Primetime Thursday found a growing problem of cheating and plagiarism, facilitated by the massive volumes of content on the web.  From junior high schools to the Ivy League, Primetime found that students find the ability to cut-and-paste from the web an irresistible temptation.  According to the Primetime Thursday report, many students believe that “everyone” plagiarizes, and they use this as an excuse for their theft:

"It's unfair on your part, if you're studying, you know, so many hours for an exam and everybody else in the class gets an 'A' cheating," says Sharon, a college student.

"So you want to get in the game and cheat, too."

The web is a double-edged sword.  Just as it has facilitated the theft of content, it has also enabled tools for publishers and professors to quickly detect stolen content.  Let’s take a closer look.

Detecting Plagiarism

Fortunately, it’s just as easy for someone to detect plagiarism as it is for the scumbags to copy content off of other people’s web pages.  There are several web sites that aid in detecting plagiarism.

·        Amazon – The Amazon “search inside the book” feature has resulted in dozens of lawsuits for plagiarism as unscrupulous authors were caught within days of the introduction of the feature.

·        Google – The Google search engine is used by almost all college professors today, and the new Google Print facility is now indexing thousands of books into the Google engine.

·        Turnitin.com – This wonderful web site is available to academics everywhere and provides instant web content matching for papers and college essays. (www.turnitin.com)

Now that the web has given us tools to detect the plagiarist, the threat of getting caught has acted as a deterrent.  However, the punishments for the plagiarist can run the gamut from a slap on the wrist to loss of

 

A Question of Honor

Since I make my living selling my work, I have an intense hatred for those who steal the work of others.  When I was a professor at a major state university I would always make sure that all of my students understood the difference between “fair use”, author attribution, and the seriousness of stealing the works of others and calling them your own.

The punishments for plagiarists are the most severe at schools that employ and enforce an honor code such as the U.S. military academies.

“We will not Lie, Cheat, Steal, nor Tolerate Among us those Who Do”

Please note that the honor code requires any student to turn in any other student whom they suspect of lying, cheating or stealing.  This creates a self-policing system to ensure personal honor and integrity.

John Garmany, a noted author with Rampant TechPress and a graduate of West Point, notes that the honor code made plagiarism virtually non-existent:

 “We were well versed in plagiarism and we would never think of using someone else’s work without giving them credit.  

An honor code violation meant dishonor and dismissal from West Point and we took it very seriously.  

For example, we were allowed to ask another cadet for help, but we were required to mention the helper by name, even if we did not use any of their ideas directly”

Sadly, enforcement against web content theft is sporadic at best, even among the top schools.

Punishment for Plagiarists

In my experience as an adjunct professor, plagiarism is largely tolerated at major U.S. colleges and universities, and I found it to be extremely frustrating.

I remember one case where a U.S. military officer submitted a computer program that matched the work of another, line for line.  Upon investigation I discovered that he had lifted someone else’s work from a trash bin and copied it, adding only his name as the author.

I was especially offended because this officer was a graduate of a U.S. Military academy, and was completely familiar with the honor code and the ethics of an officer and a gentleman.  Upon confrontation, he was completely unremorseful and gave the lame excuse “everyone does it”. 

In this case I wanted to show no mercy, and I attempted to flunk the student and file a complaint against him with the university.  I was fully aware that the Armed Forces would not look favorably on him, that he would lose his security clearance and could be summarily dismissed from the armed forces, perhaps losing his retirement and, most of all, his personal honor.

Unfortunately, the Dean of my College was far more tolerant than I was, and refused to allow me to pursue my complaint.  To me, it seems that the lack of any threat of consequences enables the web thief.

Plagiarism, intentional or not, is considered stealing and can expose you to serious liability.  In 2004 I was reviewing a job interview book and discovered an entire page that I had written which had been stolen and published by one of the world’s largest publishers.  Fortunately for the publisher, I was also one of their authors, and I was familiar with their contract that holds the author solely responsible for ensuring that their content is their own work.

Those who are victims of plagiarism are entitled to the following remedies:

·        To have the offending book recalled from distribution – This can cost the publisher over $100k, and the author was required to pay for it.

·        To an official published apology – The plagiarist must publicly admit their theft and acknowledge the rightful creator of the material.

·        Civil damages – In one case, the victim sued the author and received over a quarter of a million dollars.  The author lost his house and savings, and was ruined by his act.

Laws against Plagiarism

According to the United States Constitution, “The Congress shall have power to promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive rights to their respective writings and discoveries”.  The U.S. Supreme Court has also addressed the plagiarism issue, and courts have also used the “Lanham Act” to justify punitive damages for plagiarism.

In 1948, Doubleday copyrighted and published General Dwight D. Eisenhower’s book, Crusade in Europe, which was about the D-Day invasion, and Fox later created a TV series from it.

For the fiftieth anniversary of World War II, a third-party company named Dastar edited the Crusade in Europe television series, added some new material, and released a video set called World War II Campaigns in Europe without attribution to Fox.

In the famous Dastar v. Twentieth Century Fox case (539 U.S. 23), the lower courts held Dastar liable for copying Twentieth Century Fox material without giving them proper credit, although the U.S. Supreme Court ultimately reversed that judgment:

their complaint […] claims that Dastar's sale of Campaigns “without proper credit” to the Crusade television series constitutes “reverse passing off” in violation of § 43(a) of the Lanham Act, 15 U.S.C. § 1125(a)

In this case we see that the trial court doubled the amount of the damages, an award that stood until the Supreme Court's reversal.  When plagiarism is intentional and with malice, courts are allowed to impose “punitive” damages, doubling and even tripling the amount of the actual damage to punish the plagiarist: 

Professors Against Plagiarism:

SOURCE: http://professorsagainstplagiarism.blogspot.com/

 

Bogus Conferences

Looking over the Internet we discovered this story:

http://diehimmelistschoen.blogspot.com

Really very interesting

So, the universities that organize WSEAS conferences must be very careful in their review process.

After this mistake by IEEE, we must be careful.

Also, we found via Wikipedia this:

http://en.wikipedia.org/wiki/SCIgen
which lists many bogus conferences (outside WSEAS).

WSEAS is very proud that we have a very strict review process.
So, I copy this text from Wikipedia:

In 2008 and 2009, several computer-generated (gibberish) conference articles, with fictitious authors, appeared in the IEEE Xplore database, coming from many IEEE-sponsored events. Other poor-quality conference articles have also occasionally appeared in IEEE conferences and consequently in IEEE Xplore. The IEEE itself accepted (see http://www.ieee.org/web/aboutus/corporate/board/ad_hoc_committees/qualityofconferencepapers.html ) that such articles hurt the reputation of IEEE and destroyed confidence in the quality of IEEE publications. IEEE tried to find solutions against this vulnerability but in vain, because many more bogus papers appeared in the following months (see http://iaria-highsci.blogspot.com and http://blog.marcelotoledo.org/2008/12/26/how-can-someone-trust-ieee )


List of works with noticeable acceptance

  • Rob Thomas: Rooter: A Methodology for the Typical Unification of Access Points and Redundancy, 2005 for WMSCI (see above)
  • Mathias Uslar's paper was accepted to the IPSI-BG conference[4].
  • Professor Genco Gülan published a paper in the 3rd International Symposium of Interactive Media Design[5].
  • Students at Iran's Sharif University of Technology published a paper in the Journal of Applied Mathematics and Computation (which is published by Elsevier)[6]. The students wrote under the false, non-Persian surname, MosallahNejad, which translates literally as: "from an Armed Breed". The paper was subsequently removed when the publishers were informed that it was a joke paper[7].
  • Conferences of Wessex Institute of Technology [8].
  • It seems also that the IEEE IARIA Conference accepted another bogus paper: http://iaria-highsci.blogspot.com/2008/12/we-have-letter-of-acceptance-fantastic.html
  • A paper titled "Towards the Simulation of E-Commerce" by Herbert Schlangemann got accepted as a reviewed paper at the "International Conference on Computer Science and Software Engineering" (CSSE) and was briefly in the IEEE Xplore Database [9]. The author is named after the Swedish short film Der Schlangemann. Furthermore the author was invited to be a session chair during the conference[10]. Read the official Herbert Schlangemann Blog for details[11]. The official review comment: "This paper presents cooperative technology and classical Communication. In conclusion, the result shows that though the much-touted amphibious algorithm for the refinement of randomized algorithms is impossible, the well-known client-server algorithm for the analysis of voice-over-IP by Kumar and Raman runs in _(n) time. The authors can clearly identify important features of visualization of DHTs and analyze them insightfully. It is recommended that the authors should develop ideas more cogently, organizes them more logically, and connects them with clear transitions"
  • In 2009, the same incident happened again: Herbert Schlangemann's latest fake paper "PlusPug: A Methodology for the Improvement of Local-Area Networks" was accepted for oral presentation at another international computer science conference [12]. Recently, Denis Baggi, Chairman, IEEE CS, confessed, according to a comment on the Schlangemann Blog, that "Selection criteria such a refereeing etc. are meaningless", which probably means that IEEE has accepted the unreliability and bogosity of its conferences. Denis Baggi also adds: "Articles should be written only if someone has something to tell others, in which case the validity of the paper is obvious".

A letter from Evan M. Butterfield (IEEE)

A letter from Evan M. Butterfield (Director of Products & Services, IEEE Computer Society, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720, 714.816.2165), dated Jan 17, 2009, stated the following:

The IEEE Computer Society (CS) has evidence that multiple (IEEE) conferences are receiving machine-generated papers. In two cases, conferences have actually accepted an obviously fraudulent submission. This is a serious issue that threatens the credibility of your conference, the quality of the digital library, and the reputation of both the IEEE and CS. It requires your immediate attention. Please take this opportunity to ensure that your peer review processes are being followed, and adapt to any new requirements that may be communicated by the IEEE or the Computer Society. No conference published by CPS should rely on an abstract review. It is very important that you review carefully the full text of all papers submitted to your conference. If you have already accepted papers, your program committee should review the full text again. While CPS staff will be conducting random spot-checks of conference papers in the publishing queue, we are relying on you to authenticate the content of your proceedings. Any papers that were not actually presented at your conference need to be brought to our attention, and should receive close review. In known cases, the machine-generated origin is obvious from a reading of the first few paragraphs of the paper; the abstracts are human-generated and do not indicate the quality of the paper itself. In the past, papers have been submitted by “Herbert Schlangemann,” but be mindful that the perpetrator of this fraud will change the approach over time. In the event you discover any evidence of questionable content or behavior, please communicate that to us immediately along with an action plan for addressing the problem. Thank you for your help in maintaining the quality of our products. See: http://bogusconferences.blogspot.com/2009/05/bogus-conferences-ieee-confess.html


Criticism concerning publishers

Recently, many fake papers appeared in several IEEE conferences, because the IEEE grants its name and its logo to many local organizers who supposedly do not conduct a thorough review process. It is being argued that such conferences only exist to make money out of researchers who are looking for a simple way to publish their work. In particular, publishers like IARIA, http://www.iaria.org, HIGHSCI http://www.highsci.org and SRP http://www.scirp.org appear questionable. As seen from their web sites, IARIA, HIGHSCI and SRP use the name of IEEE and the IEEE publishing services, thus attracting numerous papers. To test one such conference, some people went further and sent the paper "A Statistical Method For Women That Can Help Our Sexual Education" to the IEEE conference organized by IARIA. This paper received automatic acceptance within a few hours, with a simultaneous demand for direct payment. Unfortunately, this paper was not published because the authors did not pay the registration fee. However, the letter of acceptance is published on the web and anybody can check it: http://iaria-highsci.blogspot.com/, http://scamieee.blogspot.com/

Other protest blogs are:

  • Official Protests[13].
  • Bogus Conferences [14].
  • "Netdriver"[15].
  • "Another Letter of acceptance in an IEEE Conference"[16].
  • Anti-Plagiarism Web Log[17].
  • "How can someone trust IEEE?"[18].
  • "Open Letter"[19].
  • "A letter from Evan M. Butterfield (IEEE) "[20].

See also

In September 2008 the Journal for Scientific Publications of Aspirants and Doctoral Candidates published a machine translation (with some human intervention) of Rooter into Russian, signed by a certain "Mikhail Zhukov" (a feigned name, used by journalists from the Troitsky variant newspaper, who wanted to demonstrate the low quality of scientific publications and of the peer review process in Russia). Rooter got good-to-excellent comments from the peer reviewer, praising the high practical applicability of the matter researched and the novelty of the material; the only negative comment concerned the style, which was claimed to be more appropriate for a newspaper than for a scientific journal. After the "author" corrected the stylistic drawbacks, the article was accepted for publication.

Following the publication and consequent scandal, the presidium of the Higher Attestation Commission of Russia struck the Journal for Scientific Publications of Aspirants and Doctoral Candidates from the official list of journals authorized to publish research materials of aspirants and doctoral candidates.

Notes

  1. ^ Stribling, Jeremy; Aguayo, Daniel; Krohn, Maxwell. "Rooter: A Methodology for the Typical Unification of Access Points and Redundancy" (PDF). http://pdos.csail.mit.edu/scigen/rooter.pdf.
  2. ^ Rob Thomas. "The Dangers of Spamferences" (HTML). http://thepeerreview.ca/view.php?aid=221.
  3. ^ "SCIgen - An Automatic CS Paper Generator". MIT. http://pdos.csail.mit.edu/scigen/.
  4. ^ "Mathias Uslar's paper.". http://www.mwise.de/blog/index.php/2005/12/29/scigen-for-scientific-research-a-case-study/.
  5. ^ "About Genco Gulan's paper.". http://pdos.csail.mit.edu/scigen/blog/index.php?entry=entry060414-130910.
  6. ^ Rohollah Mosallahnezhad. "Cooperative, Compact Algorithms for Randomized Algorithms" (PDF). http://ce.sharif.edu/~ghodsi/soft-group/misc/AMC-paper.pdf.
  7. ^ John L. Casti. "REMOVED: Cooperative, compact algorithms for randomized algorithms". http://dx.doi.org/10.1016/j.amc.2007.03.011.
  8. ^ "Conferences of Wessex Institute of Technology without review". http://www.cg.tuwien.ac.at/~wp/videa.html.
  9. ^ "Paper on the IEEE Database". http://ieeexplore.ieee.org/search/freesrchabstract.jsp?arnumber=4723109&k2dockey=4723109@ieeecnfs.
  10. ^ "CSSE Conference Program". https://sites.google.com/site/herbertschlangemann/Home/csse2008_program.pdf?attredirects=0.
  11. ^ "Schlangemann's blog". http://diehimmelistschoen.blogspot.com/.
  12. ^ "IEEE International Conference on e-Business and Information System Security". http://www.ieee-ecommerce.com/.
  13. ^ "Some other conferences of IEEE". http://anti-ieee.blogspot.com/2008/02/iti-2008.html.
  14. ^ "Bogus Conferences". http://bogus-conferences.blogspot.com/.
  15. ^ "Conferences that you must avoid". http://netdriver.blogspot.com/2009/01/from-site-httpwwwanti-plagiarismorg.html.
  16. ^ "Another Letter of acceptance in an IEEE Conference". http://iaria-highsci.blogspot.com/2008/12/we-have-letter-of-acceptance-fantastic.html.
  17. ^ "Other IEEE Conferences". http://netdriver.blogspot.com/2009/01/this-is-shame-for-ieee.html.
  18. ^ "How can someone trust IEEE". http://blog.marcelotoledo.org/2008/12/26/how-can-someone-trust-ieee.
  19. ^ "Open Letter". http://dominore.blogspot.com/2009/05/i-am-freelance-journalist-who.html.
  20. ^ "A letter from Evan M. Butterfield (IEEE)". http://bogusconferences.blogspot.com/2009/05/bogus-conferences-ieee-confess.html.