I’m sketching out an idea for a readability assessment program. It will report the education level required to comfortably read a body of text, using formulas (Dale-Chall being the most significant) that count things like sentence length and the vocabulary level of each word. I was inspired by the word counter website I always paste my essays into. When it’s done, I would like to plug it into APIs so it can be used on Lemmy, Mastodon, and Discord.
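
For reference, here is a minimal sketch of the scoring step, assuming the New Dale-Chall formula (the 1995 revision) with its usual constants and grade bands. The function names `dale_chall_score` and `dale_chall_grade` are made up for illustration, and "difficult" here means a word that is not on the easy-word list; counting words and sentences is left out.

```c
#include <stddef.h>

/* Raw score per the New Dale-Chall formula (assumed 1995 constants).
 * words, sentences: totals for the text; difficult: words not on the easy list. */
double dale_chall_score(size_t words, size_t sentences, size_t difficult)
{
    if (words == 0 || sentences == 0)
        return 0.0; /* nothing to score */

    double pct_difficult = 100.0 * (double)difficult / (double)words;
    double avg_sentence_len = (double)words / (double)sentences;
    double score = 0.1579 * pct_difficult + 0.0496 * avg_sentence_len;

    /* The formula adds a fixed adjustment once more than 5% of words are difficult. */
    if (pct_difficult > 5.0)
        score += 3.6365;
    return score;
}

/* Map a score to the commonly published grade bands. */
const char *dale_chall_grade(double score)
{
    if (score < 5.0)  return "grade 4 and below";
    if (score < 6.0)  return "grades 5-6";
    if (score < 7.0)  return "grades 7-8";
    if (score < 8.0)  return "grades 9-10";
    if (score < 9.0)  return "grades 11-12";
    if (score < 10.0) return "grades 13-15 (college)";
    return "grades 16+ (college graduate)";
}
```

With 100 words, 5 sentences, and 10 difficult words, this works out to 0.1579 × 10 + 0.0496 × 20 + 3.6365 = 6.2075, which falls in the grades 7-8 band.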

    • Parenti Bot@lemmygrad.ml (bot)
      1 year ago
      The quote

      In the United States, for over a hundred years, the ruling interests tirelessly propagated anticommunism among the populace, until it became more like a religious orthodoxy than a political analysis. During the Cold War, the anticommunist ideological framework could transform any data about existing communist societies into hostile evidence. If the Soviets refused to negotiate a point, they were intransigent and belligerent; if they appeared willing to make concessions, this was but a skillful ploy to put us off our guard. By opposing arms limitations, they would have demonstrated their aggressive intent; but when in fact they supported most armament treaties, it was because they were mendacious and manipulative. If the churches in the USSR were empty, this demonstrated that religion was suppressed; but if the churches were full, this meant the people were rejecting the regime’s atheistic ideology. If the workers went on strike (as happened on infrequent occasions), this was evidence of their alienation from the collectivist system; if they didn’t go on strike, this was because they were intimidated and lacked freedom. A scarcity of consumer goods demonstrated the failure of the economic system; an improvement in consumer supplies meant only that the leaders were attempting to placate a restive population and so maintain a firmer hold over them. If communists in the United States played an important role struggling for the rights of workers, the poor, African-Americans, women, and others, this was only their guileful way of gathering support among disfranchised groups and gaining power for themselves. How one gained power by fighting for the rights of powerless groups was never explained. What we are dealing with is a nonfalsifiable orthodoxy, so assiduously marketed by the ruling interests that it affected people across the entire political spectrum.

      – Michael Parenti, Blackshirts And Reds

      I am a bot, and this action was performed automatically. Please contact the admins of this instance if you have any questions or concerns.

    • rufuyun@lemmygrad.ml (OP)
      1 year ago

      I’ll probably use this if I get that far with this project, since I plan for the bots to be the last thing I add, after the CLI can do everything I want it to. Thank you!

  • rufuyun@lemmygrad.ml (OP)
    1 year ago

    BTW, do you guys think I should use databases for this? One of the formulas uses a list of 4,000 easy words, and storing lists of common proper nouns will help with flagging them. Also, I could probably get vocab level data for tens of thousands of words… would that be better in a DB than in a ginormous hash table or trie?

    • arbitrary@lemmygrad.ml
      1 year ago

      With a dataset that small, imo either option is fine. If it were me, I would start with an ORM + SQLite, in case I ever needed to migrate to a “real” database.
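
As a rough sketch of the in-memory side of that choice: for a few thousand words, even a sorted array searched with the standard `bsearch` would do (a simpler stand-in for the hash table or trie mentioned above). The word list and the name `is_easy_word` are placeholders for illustration, not the real easy-word list.

```c
#include <stdlib.h>
#include <string.h>

/* Placeholder excerpt; a real list would hold thousands of entries.
 * It must stay sorted in strcmp order for bsearch to work. */
static const char *const easy_words[] = {
    "apple", "book", "cat", "dog", "house", "tree", "water"
};

/* bsearch passes the key first, then a pointer to an array element. */
static int cmp_word(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns 1 if word (assumed already lowercased) is on the list, else 0. */
int is_easy_word(const char *word)
{
    size_t n = sizeof easy_words / sizeof easy_words[0];
    return bsearch(word, easy_words, n, sizeof easy_words[0], cmp_word) != NULL;
}
```

A lookup costs about log2(n) string comparisons, so roughly 12 for 4,000 words and 16 or so for tens of thousands.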

      • rufuyun@lemmygrad.ml (OP)
        1 year ago

        Thank you!

        “ORM + sqlite”

        I am writing the CLI in C (the bots will just use it) and have never used a database. Do you think using the SQLite interface directly from C, with some cursory reading of the docs, would be too much? Of course, I could switch it all to C++, where there appears to be at least one nice ORM.

        • arbitrary@lemmygrad.ml
          1 year ago

          I think if you’re storing vocabulary etc., using the C interface for SQLite wouldn’t be too unwieldy, and it would be a good learning experience if you haven’t done much raw SQL query writing of your own. Even when you use an ORM, there are often times when you need to write your own queries for more complicated situations.

          One other suggestion: once you have the CLI and bots working, you could abstract this even more. Have a service process that does the actual text analysis and communicates in some way (IPC, a network port, etc.). Your CLI and bots can then just interface over that channel. This gives you a separation of duties, so you can implement new clients/servers or rework existing ones much more easily.