CSC 590 Lecture Notes Week 5
Further Discussion of How to Evaluate a Thesis



  1. See again the writeup for Assignment 2.

  2. Specific requirements for Assignment 2.

    1. DUE: 7PM Monday 13 May

    2. You are to evaluate two MS theses, at least one in your chosen area of research.

      1. One from Cal Poly.

      2. Another from Cal Poly or from a comparable MS-granting university, such as another CSU.

    3. Format of evaluation:
      Title of Thesis:
      Author:
      Date of Publication:
      Institution:
      Type of thesis (project, experimental, theoretical, survey, other (specify)):
      Area of work (e.g., AI, Distributed, Networks, SE):
      1. Problem definition
      Score: (1 - 5, fractional scores are OK)
      Justification and critique ...
      Sections 2 through 9 cover assessment criteria "Writing Quality" through "Potential for future research", as shown on the sheet.
      10. Overall Quality
      Score: (1 - 5, fractional scores are OK)
      Justification and critique ...


  3. Other useful things to look for in a thesis.

    1. Is it generally well organized?

    2. Is it interesting, i.e., is it a "good read"?

    3. Did you learn something having read it?


  4. Thesis organizational requirements, and related guidelines.

    1. All CSC faculty have guidelines for the MS theses undertaken by their advisees.

    2. For example, I have a generic thesis outline for a project-oriented thesis; it's posted here: http://users.csc.calpoly.edu/~gfisher/students/generic-ms-outline.html

    3. Other faculty have similar guidelines.


  5. What the organizational guidelines all have in common:

    1. Chapter 1: State the problem, including a definition of the scope of the work.

    2. Chapters 2 - N: Present the solution, in a manner appropriate to the subject matter.

    3. Chapter N+1: Present conclusions and future work, including summary of contributions.

  6. Evaluation examples.

    1. There is one complete example here: 590/examples/assignment2/thesis-eval.pdf

    2. What follows in these notes are summary evaluations for the four most typical types of thesis submitted by Cal Poly MS students in the Computer Science department:

      1. Project, with experimental validation.

      2. Project, with proof-of-concept validation.

      3. User survey, with statistical validation.

      4. Evaluative survey, with rhetorical validation.


  7. Front matter.

    1. Title: "Localized Type Inference of Atomic Types in Python"
      Author: Brett Cannon
      Date: June 2005
      Institution: Cal Poly
      Type: project, experimental
      Area: programming languages

    2. Title: "Lojban as a Tool for Encoding Prose on the Semantic Web"
      Author: Brandon Wirick
      Date: November 2005
      Institution: Cal Poly
      Type: project
      Area: AI

    3. Title: "Prototyping in Industry: A Survey of Prototyping Use in Software Development"
      Author: Julia Smith
      Date: June 2004
      Institution: Cal Poly
      Type: statistical survey
      Area: software engineering

    4. Title: "Audio Codecs for Remote Radio Broadcasting"
      Author: Randy Scovil
      Date: June 2005
      Institution: Cal Poly
      Type: survey
      Area: communications

  8. Organizational comparison.

    1. "Type Inference"

      1. Introduction

      2. Type Inference Algorithms

      3. Challenges of Inferring Types in Python

      4. Previous Attempts to Infer Types in Python

      5. Inferring Atomic Types in Local Namespace

      6. Type-Specific Bytecodes

      7. Experiments & Results

      8. Conclusion

      9. Future Work

    2. "Lojban"

      1. Introduction

      2. Background

      3. Literature Review

      4. Solutions

      5. Details of Lojban

      6. Parsing Prose According to Lojban Semantics

      7. Concluding Analysis and Results
        Appendix A: Source Code for Prose Parser
        Appendix B: Results of Test Cases

    3. "Prototyping"

      1. Introduction

      2. Background

      3. Previous Work

      4. The Survey

      5. Survey Results

      6. Conclusions and Future Research
        Appendix A: Survey Coding


    4. "Codecs"

      1. Introduction

      2. Digital Audio Fundamentals

      3. Encoding Audio

      4. Audio Compression

      5. Speech Codecs

      6. Radio Remote Broadcasting

      7. Codec Criteria

      8. Codec Evaluation

      9. Conclusions and Recommendations


  9. Evaluation details.

    1. "Type Inference" (see the Assignment 2 example for the complete details).

      1. Problem definition:
        Score: 5
        • Specifies a very specific goal for runtime performance increase -- a 5% improvement.
        • Lays out the subsidiary technical problems very thoroughly.
        • Delivers a good solution, even though performance-improvement goal is not achieved.

      2. Writing quality
        Score: 3.8
        • Uses "royal we", and some colloquial language.
        • Some long-winded passages, e.g., "Specifically, we explore ... "

      3. Contribution
        Score: 4
        • Contributes a valuable piece of information, though not extremely groundbreaking.
        • The concluding chapter does a good job of summarizing the contributions.
        • There are 26 references, to very appropriate places.
        • Chapter 4 specifically compares and contrasts the approach of the thesis to others.

      4. Originality
        Score: 4
        • It's an interesting specific idea, though, again, not groundbreaking.
        • It's doing research for Python that's been done in a number of other languages.

      5. Technical depth
        Score: 5
        • The review coverage of type inference is suitably deep.
        • The design and implementation components are non-trivial.
        • Chapters 2-6 provide very thorough discussion of type theory, type checking design and implementation.

      6. Implementation
        Score: 5
        • The implementation is fully functional, and meets the requirements set out in the introduction.
        • Chapters 5 and 6 cover the implementation in depth.

      7. Validation
        Score: 5
        • The experimentation is well set up, uses appropriate benchmarks, and gathers real data.
        • A solid chapter 7 presents the experimental results.
        • Unfortunately, the performance improvement goal was not achieved.

      8. Publication Potential
        Score: 3
        • A partially-refereed version of the results appeared in the PyCon conference.
        • Due to negative results, and similarity to work in other languages, it's doubtful that the work could be published in a top-flight conference, such as SIGPLAN.
        • Without additional work and better results, it's not journal publishable, e.g., TOPLAS.

      9. Future Potential
        Score: 5
        • There is definitely potential for future research, given the negative results.
        • A full chapter (9) does a good job of laying out what specifically could be done.

      10. Overall
        Score: 4.8
        • Despite the negative performance results, it's a solid piece of work.


    2. "Lojban"

      1. Problem definition:
        Score: 3
        • The Lojban-related aspect of the problem is well defined, specifically by THE research question on page 6 of Chapter 1:
          "Can the predicate relationships contained within an arbitrary piece of prose, written in some non-trivial subset of Lojban, be automatically extracted and made available to the Semantic Web in a format that complies with a single, static ontology?"
        • However, the overall scope of the problem is arguably too ambitious for an MS thesis.
        • There is a lot of high-powered research on semantic ontologies, and Brandon is playing in a very big league here.
        • And given the overly broad scope, it is essentially impossible to deliver anywhere near a complete solution.

      2. Writing quality
        Score: 5
        • Written in a very clear and engaging style, throughout.
        • Example of a particularly well-written passage "lo zukte cu sibdo fe lo gunma". ;)

      3. Contribution
        Score: 3
        • In terms of an idea, the thesis makes a very interesting contribution.
        • However, the broad scope of the project means that the specific technical contribution is small.
        • There are 39 references to appropriate literature sources.
        • There is some ontology work from CMU and MIT that is notably uncited.
        • A good number of references are to websites instead of refereed literature, and unnecessarily so.
        • Two chapters do a good job of reviewing the literature, with comparisons to the approach proposed in the thesis.

      4. Originality
        Score: 5
        • The thesis presents a very original idea.
        • Viz., the use of a highly-structured but still natural language as an intermediate representation for semantic information.

      5. Technical depth
        Score: 2
        • The overall subject matter is very deep.
        • However, the specific technical contribution of parsing Lojban is superficial.

      6. Implementation
        Score: 1.5
        • The implementation is the weakest part of the thesis.
        • As noted above, the parsing implementation is very simple, and there is only a single chapter (6) devoted to it.

      7. Validation
        Score: 2
        • Given the highly-ambitious nature of the problem, the validation is rather weak.
        • To demonstrate the results vis-à-vis other ontology frameworks would be a huge task.
        • Chapter 7 and Appendix B present some simple test cases, but the presentation is far from thorough.

      8. Publication Potential
        Score: 4
        • With some further work (probably quite a lot of it), the work is probably publishable in AAAI or IJCAI.
        • A PhD-level treatment could be worthy of AI journal publication.

      9. Future Potential
        Score: 5
        • As has been noted, this is an extremely interesting idea that has much potential for further work.
        • The thesis does not really do justice to this potential, focusing Chapter 7 on specific technical work related to Lojban parsing.

      10. Overall
        Score: 3
        • A very interesting idea, underdelivered upon.


    3. "Prototyping"

      1. Problem definition:
        Score: 2.5
        • The specific problem is to conduct a survey of professional software engineers to determine to what extent they use prototyping.
        • A weakness with the problem statement is an imprecise definition of what prototyping is.
        • In this sense, the results of the study define the problem, rather than the other way around.

      2. Writing quality
        Score: 3
        • There is some lack of clarity in the introduction; the quality of the later chapters is entirely adequate.
        • An example of a poor passage, from the beginning of Chapter 1: "Prototyping is proven to be useful ... . However, when hearing the word ``prototype'' most developers think about prototyping the user interface (UI). In a school project of mine ... "

      3. Contribution
        Score: 2.5
        • The statistical results of the survey are informative.
        • However, the lack of clear prototyping definition detracts.
        • There are 35 references to some reasonable places.
        • However, there are significant authors (Shneiderman) and tools (Macromedia) that are unreferenced.
        • A comparison of cited related work is covered adequately in Chapter 3.

      4. Originality
        Score: 3
        • The idea of doing a survey in this particular area is sufficiently original, given that none has appeared in the recent literature.
        • There have been a number of related surveys in the not-too-distant past (within ten years).
        • There are no significant new ideas in the experimental methodology.

      5. Technical depth
        Score: 3
        • The statistical analysis is adequately deep.
        • The imprecise definition of prototyping means the depth of content in the questionnaire itself is less than it could be.

      6. Implementation
        Score: 4
        • The execution of the survey and statistical analysis are fine.
        • They are covered thoroughly in chapters 3 and 4.

      7. Validation
        Score: 5
        • Survey results and data analysis are covered well in Chapter 5.

      8. Publication Potential
        Score: 2
        • The work might be publishable in a general-purpose trade magazine, e.g., Computerworld or Dr. Dobb's Journal.
        • Given what I believe are flaws in the problem definition, I do not believe the work is publishable in a technical conference or journal, without some additional work.

      9. Future Potential
        Score: 2
        • Chapter 6 outlines some potential follow-on work, in the form of surveying a larger population.
        • However, the difficulty of reaching such a population is a significant hurdle for any future work.

      10. Overall
        Score: 3
        • The flaw in problem definition renders the results less significant than they could have been.


    4. "Codecs"

      1. Problem definition:
        Score: 4
        • The problem is to present a survey of a rapidly evolving field, where there are many competing technologies being brought to bear.
        • Chapter 1 makes the problem clear enough, but there is no single sentence or paragraph that says, in effect, "The problem is this ...".
        • The "this" includes the difficulty that even knowledgeable practitioners have in keeping abreast of all the developments, given the rapid pace of the field.

      2. Writing quality
        Score: 5
        • Clearly written overall.
        • Section 1.2 provides a very nice timeline of developments in the field, which substantially clarifies the material presented in subsequent chapters.

      3. Contribution
        Score: 4
        • The presented material is valuable to those interested in the subject matter.
        • The concluding chapter could have offered some further comparative analysis, and suggested possible future trends.
        • There is an extensive bibliography with 72 citations.
        • Given the nature of the survey, a comparison of the thesis work to that of others is not appropriate.
        • The thesis does a good job of comparing and contrasting the cited works.

      4. Originality
        Score: 4
        • No such survey has appeared in the literature to my knowledge.
        • The lack of any new idea, or even a prediction of future trends, detracts in this area.

      5. Technical depth
        Score: 4.5
        • The depth of the material covered in chapters 2 through 6 is good.
        • Again, the lack of any specific new ideas detracts here as well.

      6. Implementation
        Score: 5
        • Chapters 7 and 8 present codec evaluation criteria, and a thoughtful application of them to the cited work.

      7. Validation
        Score: 5
        • The evaluative analysis in Chapter 8 is good.

      8. Publication Potential
        Score: 4
        • A subset of the work might well be publishable in a general-readership journal, such as CACM or IEEE Computer.
        • There is also a possibility of publication in ACM Computing Surveys.

      9. Future Potential
        Score: 1
        • By their nature, comparative surveys do not have much immediate potential for future work.
        • There is certainly potential for plenty of new work in the field, but the thesis does not point to any that would derive directly from it.

      10. Overall
        Score: 4.5
        • Overall, an informative survey.
        • As a thesis, some contribution of an original idea, in some form, would have made it much better.



