31 January 2022

Bloom Filter

A Bloom filter is a space-efficient probabilistic data structure for testing whether an element is a member of a set.[1]

It never returns a false negative, but it can return false positives. So a positive answer means the element probably belongs to the set, but that cannot be 100% certain.[8]

In other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed (though this can be addressed with the counting Bloom filter variant); the more items added, the larger the probability of false positives.[1]

A Bloom filter uses a hash function to map a key to a bucket. However, it does not store the key in that bucket; it simply marks the bucket as filled. So many keys might map to the same filled bucket, creating false positives.
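The marking scheme described above can be sketched in plain Java. This is only a minimal illustration, not Guava's implementation: the class name SimpleBloomFilter, the FNV-1a helper, and the double-hashing trick used to derive k bucket positions from two base hashes are all assumptions chosen for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

/** Minimal Bloom filter sketch: m buckets (bits), k hash positions per key. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int m; // number of buckets (bits)
    private final int k; // number of positions marked per key

    SimpleBloomFilter(int m, int k) {
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    /** Second base hash (FNV-1a, 32 bit); an illustrative choice only. */
    private static int fnv1a(byte[] data) {
        int hash = 0x811c9dc5;
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 16777619;
        }
        return hash;
    }

    /** Derives the i-th bucket index from two base hashes (double hashing). */
    private int index(String key, int i) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        int h1 = Arrays.hashCode(data);
        int h2 = fnv1a(data);
        return Math.floorMod(h1 + i * h2, m);
    }

    /** Marks the k buckets for this key; the key itself is never stored. */
    void put(String key) {
        for (int i = 0; i < k; i++) {
            bits.set(index(key, i));
        }
    }

    /** Returns false if any bucket is unmarked ("definitely not in set"). */
    boolean mightContain(String key) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true; // "possibly in set"
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(18, 3); // m = 18, k = 3
        for (String s : new String[] {"x", "y", "z"}) {
            filter.put(s);
        }
        System.out.println("x: " + filter.mightContain("x")); // always true: no false negatives
        System.out.println("w: " + filter.mightContain("w")); // false, unless w is a false positive
    }
}
```

Because only bits are set, two different keys can mark the same buckets, which is exactly where false positives come from.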

Interestingly a Bloom filter can also trade accuracy for space.
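To make the accuracy/space trade-off concrete, here is a small sketch based on the textbook sizing formulas m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2, where n is the expected number of insertions and p the target false positive probability. The class and method names are hypothetical, and a real library may round or size differently.

```java
/** Sketch: textbook Bloom filter sizing (names are illustrative). */
final class BloomSizing {

    /** Number of bits m needed for n insertions at false positive probability p. */
    static long optimalBits(long n, double p) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    /** Number of hash functions k that minimizes false positives for m bits and n insertions. */
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1000;
        for (double p : new double[] {0.1, 0.01, 0.001}) {
            long m = optimalBits(n, p);
            int k = optimalHashes(n, m);
            System.out.println("p=" + p + " -> bits=" + m + ", hashes=" + k);
        }
    }
}
```

A lower target probability costs more bits per element, and a smaller bit array means more false positives.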


An example of a Bloom filter, representing the set {x, y, z} . The colored arrows show the positions in the bit array that each set element is mapped to. The element w is not in the set {x, y, z} , because it hashes to one bit-array position containing 0. For this figure, m = 18 and k = 3. (wikipedia)

 

A Guava BloomFilter is created by calling the static method create on the BloomFilter class, passing in a Funnel object and an int representing the expected number of insertions. A Funnel, also new in Guava 11, is an object that can send data into a Sink.[6]

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

//Creating the BloomFilter, sized for 1000 expected insertions
BloomFilter<byte[]> bloomFilter = BloomFilter.create(Funnels.byteArrayFunnel(), 1000);

//Putting elements into the filter
//A BigInteger representing a key of some sort
bloomFilter.put(bigInteger.toByteArray());

//Testing for element in set (the Guava method is mightContain)
boolean mightBeContained = bloomFilter.mightContain(bigIntegerII.toByteArray());

 

https://github.com/devwebcl/java-samples/tree/master/src/main/java/cl/devweb/guava/bloomfilter


Bad sample: the expected number of elements is 5 but we insert 100,000, which gives far too many false positives:

BloomFilter<Integer> filter = BloomFilter.create(
 Funnels.integerFunnel(),
 5,
 0.01);
IntStream.range(0, 100_000).forEach(filter::put);
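A rough estimate of why the sample above degrades can be computed with the textbook approximation (1 - e^(-kn/m))^k. The sketch below does not call Guava; it assumes the filter is sized with the standard formulas, which for 5 expected insertions at 0.01 gives about m = 48 bits and k = 7 hash functions. Those two values, and the class and method names, are assumptions for illustration.

```java
/** Sketch: estimated false positive probability of an overfilled Bloom filter. */
final class OverfilledBloomEstimate {

    /** Textbook approximation for m bits, k hash functions and n inserted elements. */
    static double estimatedFpp(long m, int k, long n) {
        return Math.pow(1 - Math.exp(-(double) k * n / m), k);
    }

    public static void main(String[] args) {
        long m = 48; // assumed size for 5 expected insertions at p = 0.01
        int k = 7;   // assumed number of hash functions for that size
        System.out.println("as designed (n=5):       " + estimatedFpp(m, k, 5));
        System.out.println("overfilled (n=100,000):  " + estimatedFpp(m, k, 100_000));
    }
}
```

With so few bits, practically every bit is set after 100,000 insertions, so nearly every lookup answers "possibly in set".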

 

Funnel

A Funnel describes how to decompose a particular object type into primitive field values. For example, if we had

class Person {
 final int id;
 final String firstName;
 final String lastName;
 final int birthYear;
}

our Funnel might look like

Funnel<Person> personFunnel = new Funnel<Person>() {
  @Override
  public void funnel(Person person, PrimitiveSink into) {
    into
      .putInt(person.id)
      .putString(person.firstName, Charsets.UTF_8)
      .putString(person.lastName, Charsets.UTF_8)
      .putInt(person.birthYear); // read the field from the person being funneled
  }
};


  1. https://en.wikipedia.org/wiki/Bloom_filter
  2. https://llimllib.github.io/bloomfilter-tutorial/
  3. https://www.baeldung.com/guava-bloom-filter
  4. https://www.geeksforgeeks.org/bloom-filters-introduction-and-python-implementation/
  5. https://github.com/google/guava/wiki/HashingExplained
  6. https://dzone.com/articles/using-guava-bloomfilter-guard 
  7. https://github.com/bbejeck/guava-blog/blob/master/src/test/java/bbejeck/guava/hash/BloomFilterTest.java
  8. https://www.i-programmer.info/programming/theory/2404.html 

 

PS: there is also the Invertible Bloom Filter!

 

25 January 2022

The Bank of San Serriffe 2

After a long time, again my 2 cents for FG Knuth

Added:
+      Germán González-Morris           0x$6.40

Update 2025.01

                  Germán González-Morris 0x$6.60


https://www-cs-faculty.stanford.edu/~knuth/boss.html

13 January 2022

Jenkins directory structure

From official documentation:

JENKINS_HOME has a fairly obvious directory structure that looks like the following:

JENKINS_HOME
 +- config.xml     (jenkins root configuration)
 +- *.xml          (other site-wide configuration files)
 +- userContent    (files in this directory will be served under your http://server/userContent/)
 +- fingerprints   (stores fingerprint records)
 +- nodes          (slave configurations)
 +- plugins        (stores plugins)
 +- secrets        (secrets needed when migrating credentials to other servers)
 +- workspace      (working directory for the version control system)
     +- [JOBNAME]      (sub directory for each job)
 +- jobs
     +- [JOBNAME]      (sub directory for each job)
         +- config.xml     (job configuration file)
         +- latest         (symbolic link to the last successful build)
         +- builds
             +- [BUILD_ID]     (for each build)
                 +- build.xml      (build result summary)
                 +- log            (log file)
                 +- changelog.xml  (change log)

However, the (legacy) version I am using is a bit different.

09 January 2022

RequireUpperBoundDeps

Sooner or later you will face this issue when trying to compile Jenkins plugins.

In this case, the managed version of org.ow2.asm:asm (5.0.4) is stale: a transitive dependency requires the newer 8.0.1.

[INFO] --- maven-enforcer-plugin:3.0.0-M3:enforce (display-info) @ azure-ad ---

[INFO] Adding ignore: module-info
[INFO] Ignoring requireUpperBoundDeps in com.google.guava:guava
[WARNING] Rule 5: org.apache.maven.plugins.enforcer.RequireUpperBoundDeps failed with message:
Failed while enforcing RequireUpperBoundDeps. The error(s) are [
Require upper bound dependencies error for org.ow2.asm:asm:5.0.4 paths to dependency are:
+-org.jenkins-ci.plugins:azure-ad:1.2.2
  +-org.jenkins-ci.main:jenkins-core:2.271
    +-com.github.jnr:jnr-posix:3.0.45
      +-com.github.jnr:jnr-ffi:2.1.8
        +-org.ow2.asm:asm:5.0.4 (managed) <-- org.ow2.asm:asm:5.0.3
and
+-org.jenkins-ci.plugins:azure-ad:1.2.2
  +-org.jenkins-ci.main:jenkins-core:2.271
    +-com.github.jnr:jnr-posix:3.0.45
      +-com.github.jnr:jnr-ffi:2.1.8
        +-org.ow2.asm:asm-tree:5.0.3
          +-org.ow2.asm:asm:5.0.4 (managed) <-- org.ow2.asm:asm:5.0.3
and
+-org.jenkins-ci.plugins:azure-ad:1.2.2
  +-com.microsoft.azure:azure:1.22.0
    +-com.microsoft.azure:azure-client-authentication:1.6.4
      +-com.microsoft.azure:adal4j:1.6.2
        +-com.nimbusds:oauth2-oidc-sdk:5.64.4
          +-net.minidev:json-smart:2.3
            +-net.minidev:accessors-smart:1.2
              +-org.ow2.asm:asm:5.0.4 (managed) <-- org.ow2.asm:asm:8.0.1
]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  01:08 min
[INFO] Finished at: 2022-01-09T10:12:15-03:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M3:enforce (display-info) on project azure-ad: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed. -> [Help 1]

https://www.jenkins.io/doc/developer/plugin-development/updating-parent/

https://ourcraft.wordpress.com/2016/08/22/how-to-read-maven-enforcer-plugins-requireupperbounddeps-rule-failure-report/

 

 

02 January 2022

Jq cheat sheet

jq .
jq type
jq keys


Samples:

cat adb_shared.json | jq keys
cat adb_shared.json | jq .ADB_Instance
cat adb_shared.json | jq .ADB_Instance | head

length (count):
cat adb_shared.json | jq ".ADB_Instance | length"

jq length (count), piping one jq into another:

jq '.data' zcluster.json | jq length

extract a key from each element of the array, using the iterator operator .[]:

jq '.data' zadb.json |  jq '.[] | .id'
jq '.data' zadb.json |  jq '.[] | ."defined-tags"'

jq '.data' zadb.json |  jq '.[] | ."db-name"'
 

jq '.data' zadb.json |  jq '.[] | ."db-name" + " " + (."is-dedicated"|tostring) + ", " + (."defined-tags"|tostring)'
 

Names with dashes:
You need to enclose the name in double quotes (optionally inside brackets):

cat zsl-sample1.json | jq .data | jq '."ingress-security-rules"'

function: tostring
It has to be inside the quotation marks.

terraform output -json | jq '.elastic_endpoint.value'

alias jqq='jq --color-output < swagger.json | less'

 
Remove additionalProperties from a JSON file:

jq 'walk(if type == "object" and has("additionalProperties") then del(.additionalProperties) else . end)'  < sw1.json | grep -i add
 

Samples querying this JSON:

[
  {
    "Id": "cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b",
    "Names": [
      "condescending_jones",
      "loving_hoover"
    ]
  },
  {
    "Id": "186db739b7509eb0114a09e14bcd16bf637019860d23c4fc20e98cbe068b55aa",
    "Names": [
      "foo_data"
    ]
  },
  {
    "Id": "a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19",
    "Names": [
      "jovial_wozniak"
    ]
  },
  {
    "Id": "76b71c496556912012c20dc3cbd37a54a1f05bffad3d5e92466900a003fbb623",
    "Names": [
      "bar_data"
    ]
  }
]

contains:

devwebcl@devwebcl-PC ~/tmp
$ cat t.json | jq -c 'map(select(.Names[] | contains ("data"))) '
[{"Id":"186db739b7509eb0114a09e14bcd16bf637019860d23c4fc20e98cbe068b55aa","Names":["foo_data"]},{"Id":"76b71c496556912012c20dc3cbd37a54a1f05bffad3d5e92466900a003fbb623","Names":["bar_data"]}]

devwebcl@devwebcl-PC ~/tmp
$ cat t.json | jq -c 'map(select(.Names[] | contains ("data"))) | .[] .Id'
"186db739b7509eb0114a09e14bcd16bf637019860d23c4fc20e98cbe068b55aa"
"76b71c496556912012c20dc3cbd37a54a1f05bffad3d5e92466900a003fbb623"

Not contains:

devwebcl@devwebcl-PC ~/tmp
$ cat t.json | jq 'map(select(any(.Names[]; contains("data"))|not)|.Id)[]'
"cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b"
"a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19"

devwebcl@devwebcl-PC ~/tmp
$ cat t.json | jq '. - map(select(.Names[] | contains ("data"))) | .[] .Id'
"cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b"
"a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19"

https://github.com/stedolan/jq/wiki/Cookbook#filter-objects-based-on-the-contents-of-a-key


jq -s (slurp) concatenates JSON inputs


The --slurp (-s) option is needed, together with map(), to do so in one shot:

$ cat f1.json
{
  "records": [
    {"a": 1},
    {"a": 3}
  ]
}

$ cat f2.json
{
  "records": [
    {"a": 2}
  ]
}

$ jq -s 'map(.records[].a)' f?.json
[
  1,
  3,
  2
]

 

# concatenate arrays:
jq -s '.[0] + .[1]' sample1.json sample2.json



20 December 2021

Sonar Java Injection

Rules not enabled by default, but very useful:

 - https://rules.sonarsource.com/java/RSPEC-3749
   Members of Spring components should be injected

 - https://rules.sonarsource.com/java/RSPEC-4288
   Spring components should use constructor injection



13 November 2021

Algorithms and Data Structures for Massive Datasets

Technical Reviewer:

Dzejla Medjedovic, Emin Tahirovic, and Ines Dedovic
MEAP began July 2020 Publication in January 2022 (estimated)
ISBN 9781617298035 325 pages (estimated) printed in black & white

https://www.manning.com/books/algorithms-and-data-structures-for-massive-datasets



16 September 2021

netcat - nc

netcat
ncat
nc


-n : skip DNS lookups
-u : Use of UDP mode (instead of TCP)
-v : Verbose output
-w : timeout (seconds)
-z : Port scanner mode (zero I/O mode); only listening services are scanned (no data is sent)

Scan port 123 for NTP:
nc -z -v -u 0.us.pool.ntp.org 123


scan ports:
nc -w 2 -z 192.168.10.1 1-1024

nc -v -n 8.8.8.8 1-1000
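
What a TCP scan with -z and -w roughly does (connect with a timeout, send nothing, close) can be sketched in Java. This is an approximation for illustration, not how nc is implemented; the class name PortCheck, the default host, and the port range in main are hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Sketch: TCP connect check similar in spirit to "nc -z -w <seconds> host port". */
final class PortCheck {

    /** Returns true if a TCP connection succeeds within the timeout; no data is sent. */
    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, unreachable host, or timeout
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost"; // hypothetical default
        for (int port = 20; port <= 25; port++) {
            System.out.println(host + ":" + port + " open=" + isOpen(host, port, 2000));
        }
    }
}
```

This only covers TCP; the UDP mode (-u) has no connection handshake and would need a different approach.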



