
Drillbit Plagiarism Checker: Filters and exclusions || Hindi || 2024

5 min read

Based on eSupport for Research's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Create a project/folder first; filter/exclusion settings are applied during folder creation and persist for later uploads.

Briefing

The core takeaway is that a “plagiarism checker” workflow can be made compliant and more accurate by applying targeted exclusions—specifically for quoted material, references/bibliography, generic terms, and common knowledge—then generating a similarity report that stays under a required threshold (often “below 10”). The process matters because universities and UGC-style rules typically treat similarity differently from plagiarism: similarity can include legitimate overlap, so filtering out non-substantive sections helps committees judge the real risk.

After aligning with UGC and university regulations (which may vary slightly by institution), the workflow focuses on what can be excluded. Exclusions can include quoted content, reference lists/bibliography, table of contents, acknowledgements, and generic symbols and terms. The transcript also stresses that "common knowledge" and repeated technical phrases may be excluded only after checking with the supervisor/guide and the relevant committee, so the exclusions remain defensible when the report is shared.

Practically, the checker is used by logging in, creating a folder (project) and then uploading a document for analysis. During folder creation, filter options become available and persist for that project. The example walkthrough uses a demo project and a sample paper, then demonstrates enabling specific exclusions such as “exclude bibliography.” It also shows how adding exclusion phrases (e.g., recurring terms like ECG, or database names/phrases inserted in the text) affects what gets filtered.

The tool then asks which sources to compare against, grouped into categories such as general public publishing, publisher/internet sources, and web repositories (with multiple source categories selectable). Similarity is reported as a similarity index (the transcript also calls it the "similarity word"), with a stated requirement to keep it under 10. The transcript distinguishes "plagiarism" from "similarity index," noting that they are not identical and that the threshold is a policy requirement rather than a direct measure of wrongdoing.

Once the report is generated, the user can inspect match sources and—crucially—verify that excluded items appear as excluded in the report. The settings section shows whether exclusions like code/quotes, bibliography, and similarity-word filtering were enabled. There’s also a “duplicate phrases” control: the user can set a range (for example, selecting from 3 up to 14 words) so repeated phrases can be excluded, which is useful when a term or phrase keeps pushing similarity upward.

The transcript warns that exclusions must be applied correctly at the project/folder level. If the same file is re-submitted without updating the folder’s exclusion settings, the filtered phrases may not be removed in the next run. After generating the report, the user should include only the relevant pages/sections when submitting to the committee, and the tool can provide a plagiarism certificate printout with options to select the document type (thesis/dissertation/article). The overall message: keep similarity below the required limit by using guideline-approved filters, but always coordinate exclusions with the guide and ensure the report transparently reflects what was excluded.

Cornell Notes

The workflow centers on reducing similarity scores legitimately by applying exclusions for non-substantive text—quoted material, references/bibliography, generic terms/symbols, and common-knowledge phrases—then generating a similarity report that meets an institutional threshold (often “below 10”). Exclusions are configured at the project/folder level during creation, and they persist for later uploads. After uploading a paper, the report shows match sources and also lists what was excluded (excluded sources, excluded bibliography, excluded phrases), making the filtering auditable. A “duplicate phrases” range setting helps remove repeated technical phrases that inflate similarity. Because rules vary by university and UGC-style guidelines, the transcript repeatedly emphasizes consulting the guide/committee before excluding anything.

Which parts of a research document are typically eligible for exclusion, and why does that help?

Eligible exclusions include quoted material, references/bibliography, table of contents, acknowledgements, and generic terms/symbols. The goal is to remove overlap that is not meaningful authorship duplication—so the similarity score reflects substantive text rather than properly cited or standardized sections.
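Drillbit applies these exclusions internally through its folder settings, but the underlying idea can be illustrated with a minimal sketch: strip quoted passages and everything after a references heading before overlap is counted. The regexes and heading names below are assumptions for illustration only, not Drillbit's actual logic.

```python
import re

def apply_exclusions(text: str) -> str:
    """Illustrative pre-filter: drop quoted material and the reference
    list before similarity is measured (assumed logic, not Drillbit's
    actual implementation)."""
    # Remove double-quoted passages (straight and curly quotes).
    text = re.sub(r'"[^"]*"|\u201c[^\u201d]*\u201d', '', text)
    # Truncate at a "References"/"Bibliography" heading, if present.
    match = re.search(r'^\s*(references|bibliography)\s*$',
                      text, flags=re.IGNORECASE | re.MULTILINE)
    if match:
        text = text[:match.start()]
    return text

sample = 'He wrote "to be or not to be" in 1600.\nReferences\nShakespeare, W.'
print(apply_exclusions(sample))  # quoted phrase and reference list are gone
```

Filtering like this is why exclusions lower the score without hiding anything: the removed sections are properly cited or standardized text, not the author's own claims.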

Where do exclusions get configured, and what goes wrong if settings aren’t updated?

Exclusions are set during project/folder creation (the filter options appear while creating the folder) and remain in effect for that project. If a user adds exclusion phrases but then resubmits without updating the folder's settings (or without reusing a correctly configured folder), the next similarity run may still count those phrases, leaving the similarity score unchanged.

How does the tool handle similarity thresholds versus plagiarism?

The transcript distinguishes “similarity index/word” from plagiarism. Similarity is treated as a measurable overlap score that must be kept under a policy threshold (e.g., below 10), but it doesn’t automatically mean plagiarism. Committees still decide based on context and what was excluded.
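The "below 10" requirement refers to a percentage-style overlap score. A common way such an index is computed (an assumption here, not Drillbit's documented formula) is matched words over total counted words after exclusions:

```python
def similarity_index(matched_words: int, total_words: int) -> float:
    """Overlap score as a percentage: words flagged as matching,
    out of the words that remain after exclusions are applied.
    Illustrative formula, not Drillbit's documented computation."""
    if total_words == 0:
        return 0.0
    return 100.0 * matched_words / total_words

# A thesis with 420 matched words out of 6,000 counted words:
print(similarity_index(420, 6000))  # 7.0 -> under a "below 10" threshold
```

The formula makes the policy point concrete: a 7% score can still contain improper copying, and a 12% score can be entirely legitimate overlap, which is why committees review context rather than the number alone.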

What is the purpose of excluding repeated phrases, and how is it controlled?

Repeated phrases (like recurring technical terms or specific database/phrase strings) can inflate similarity. The tool provides a "duplicate phrases" control where the user selects a range (the example given runs from 3 up to 14 words) so phrases whose length falls within that range can be excluded, reducing similarity without removing substantive content.
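One way to mimic the "duplicate phrases" control is to find word n-grams within the selected range that occur more than once and treat them as exclusion candidates. The sliding-window sketch below illustrates the idea under that assumption; it is not Drillbit's actual algorithm, and the 3-to-14-word range comes from the transcript's example.

```python
from collections import Counter

def repeated_phrases(text: str, min_len: int = 3, max_len: int = 14):
    """Return word n-grams (min_len..max_len words long) that occur
    more than once -- candidates for a 'duplicate phrases' exclusion.
    Illustrative sketch only, not Drillbit's actual algorithm."""
    words = text.lower().split()
    counts = Counter()
    for n in range(min_len, max_len + 1):  # every length in the range
        for i in range(len(words) - n + 1):  # slide a window of n words
            counts[' '.join(words[i:i + n])] += 1
    return {phrase for phrase, count in counts.items() if count > 1}

text = "the ECG signal was filtered and the ECG signal was stored"
# "the ecg signal was" repeats, so it becomes an exclusion candidate
print("the ecg signal was" in repeated_phrases(text))  # True
```

Setting a sensible minimum length matters: very short n-grams repeat in any prose, so excluding them would hollow out the comparison rather than remove genuine boilerplate.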

How can a user verify that exclusions actually applied in the final report?

After report generation, the user checks the report’s settings and excluded sections. The report indicates excluded items such as excluded bibliography, excluded sources, and excluded phrases, and it shows match sources so the user can confirm that the filtered content is marked as excluded rather than silently removed.

What submission-related outputs are available after analysis?

The workflow includes generating a similarity report and a plagiarism certificate/print option. It also supports selecting the document type (thesis/dissertation/article) and allows adding required signature/verification fields (e.g., guide signature, head of department, librarian/director roles) before saving/printing the PDF.

Review Questions

  1. When should a user consult the guide/committee before applying exclusions, and what kinds of text are most likely to be considered “common knowledge” or non-substantive?
  2. Why is it important that exclusions are configured at the project/folder level rather than only during a single submission run?
  3. How does the “duplicate phrases” range setting influence which repeated phrases get excluded, and how would you troubleshoot if similarity stays above 10?

Key Points

  1. Create a project/folder first; filter/exclusion settings are applied during folder creation and persist for later uploads.

  2. Use guideline-approved exclusions for quoted material, references/bibliography, table of contents, acknowledgements, and generic terms/symbols.

  3. Coordinate any exclusion of common-knowledge phrases or repeated technical strings with the supervisor/guide and committee before final submission.

  4. Select the appropriate comparison sources (internet/web/publisher/repository categories) so the similarity report reflects the intended matching scope.

  5. Aim to keep the similarity index under the required threshold (e.g., below 10), while remembering that similarity is not identical to plagiarism.

  6. Verify exclusions in the generated report: excluded sources, excluded bibliography, and excluded phrases should appear as marked items.

  7. When re-running analysis, ensure the folder's exclusion settings are updated; re-submitting without updating settings can leave similarity unchanged.

Highlights

Exclusions are configured at the project/folder level, and the final report transparently lists excluded sources, excluded bibliography, and excluded phrases.
A “duplicate phrases” range (example: 3 to 14 words) helps remove repeated technical phrases that otherwise keep similarity above the threshold.
The transcript repeatedly separates similarity index requirements from plagiarism judgment—similarity is a metric, not a direct verdict.
The workflow includes both report generation and certificate/print outputs, with options tied to document type (thesis/dissertation/article).

Topics

  • Plagiarism Similarity
  • UGC Guidelines
  • Filter Exclusions
  • Duplicate Phrases
  • Report Verification