Sharp code

Thursday, January 18, 2024

BOB - Best Of Breed Architecture

What are the characteristics of good software?

It works the way it should
It is being used
It is easy to change

Easy to add code.
Easy to remove code.

Easy to debug.

Easy == fast.
Fast == more time to do interesting things

Highly evolvable code allows us to experiment with new things and evolve.

How to achieve good design - General concepts

Interfaces

"Program to an 'interface', not an 'implementation'."
"Favor 'object composition' over 'class inheritance'."
(Design Patterns: Elements of Reusable Object-Oriented Software 1994)

Delay decisions, and mock the implementation of components
that are related to resources and not to the core business.

Know your boundaries. When you need to cross a boundary, do it with interfaces and injection.
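
As a minimal sketch of crossing a boundary with an interface and injection: the names below (Storage, memoryStorage, RenamePost) are hypothetical and only illustrate the idea. The core function knows the behavior of the storage, and the in-memory mock lets us delay the decision about the real resource.

```go
package main

import (
	"errors"
	"fmt"
)

// Storage is the boundary: the core only knows this behavior.
type Storage interface {
	Save(key, value string) error
	Load(key string) (string, error)
}

// memoryStorage is a mock implementation. The real resource
// (a database, a remote service) can be chosen later.
type memoryStorage struct {
	data map[string]string
}

func newMemoryStorage() *memoryStorage {
	return &memoryStorage{data: make(map[string]string)}
}

func (m *memoryStorage) Save(key, value string) error {
	m.data[key] = value
	return nil
}

func (m *memoryStorage) Load(key string) (string, error) {
	v, ok := m.data[key]
	if !ok {
		return "", errors.New("not found: " + key)
	}
	return v, nil
}

// RenamePost is core logic. The storage is injected, so the
// function never learns which implementation it talks to.
func RenamePost(s Storage, id, title string) error {
	if title == "" {
		return errors.New("title must not be empty")
	}
	return s.Save(id, title)
}

func main() {
	store := newMemoryStorage()
	if err := RenamePost(store, "post-1", "Hello"); err != nil {
		fmt.Println("error:", err)
		return
	}
	title, _ := store.Load("post-1")
	fmt.Println(title)
}
```

Swapping memoryStorage for a real implementation later should not require touching RenamePost.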

Principle of least knowledge
Every part of your code should know the minimum about other components.
It is better to know the behavior of another part, not its internals.

Law of Demeter (LoD)

Open/Closed

If adding features or changing resources can be done without changing existing code,
only by adding code, it is a good indication that the design is good.
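
One possible way to get this property is a small registry, sketched here under assumptions: Exporter, Register and ExportPost are hypothetical names, not part of any real library. A new output format is added as a new type plus a Register call, while ExportPost stays untouched.

```go
package main

import "fmt"

// Exporter is the extension point: every output format implements it.
type Exporter interface {
	Export(title string) string
}

// exporters is the registry that ExportPost consults.
var exporters = map[string]Exporter{}

// Register adds a format without touching ExportPost.
func Register(format string, e Exporter) {
	exporters[format] = e
}

// ExportPost is closed for modification: it never names a concrete format.
func ExportPost(format, title string) (string, error) {
	e, ok := exporters[format]
	if !ok {
		return "", fmt.Errorf("unknown format %q", format)
	}
	return e.Export(title), nil
}

// textExporter is the first format we shipped.
type textExporter struct{}

func (textExporter) Export(title string) string { return "title: " + title }

// htmlExporter is a later requirement, delivered only by adding code.
type htmlExporter struct{}

func (htmlExporter) Export(title string) string { return "<h1>" + title + "</h1>" }

func main() {
	Register("text", textExporter{})
	Register("html", htmlExporter{})
	out, err := ExportPost("html", "Hello")
	fmt.Println(out, err)
}
```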

Data Driven Decision

When you don't have data to make a design decision, make the minimum possible implementation in order to collect data, and only then continue with the design.

Make it work, make it right, make it fast

Agile

Small iterations

Review the design with your co-workers

Never stop learning.

DDD - Domain-driven design

"placing the project's primary focus on the core domain and domain logic"
In the real world every component is a domain: using AWS, Google APIs, dealing with a database,
handling HTTP communication.
In order to focus on each domain problem we need to separate each domain into a different project (micro projects) of services and libraries, so we can focus on each project's core.
Later we will see how to stitch all these projects into one application.

Onion architecture, Clean architecture

Both architectures emphasize that components know about each other in one direction only.
A -> B -> C: C cannot know about A or B.
If you can add new functionality at the beginning of the chain, even better.

Tests

Test your code with mock resources

Two kinds of functions

1. Logic functions should be pure: they get all their input from outside and always return the same result for the same input.
2. Feature functions hold the feature flow, initialize resources and invoke logic functions.

A test should look like a feature function with different resources.
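
A rough sketch of the two kinds of functions, with hypothetical names (ValidateTitle, AddPost, PostStore, fakeStore). In this sketch the feature function receives its resource as a parameter, so a test is just the same call with a fake resource.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ValidateTitle is a logic function: pure, same input gives same result.
func ValidateTitle(title string) error {
	if strings.TrimSpace(title) == "" {
		return errors.New("empty title")
	}
	if len(title) > 80 {
		return errors.New("title too long")
	}
	return nil
}

// PostStore is the resource the feature needs.
type PostStore interface {
	Save(title string) (int, error)
}

// AddPost is a feature function: it holds the flow and
// invokes the logic function and the resource.
func AddPost(store PostStore, title string) (int, error) {
	if err := ValidateTitle(title); err != nil {
		return 0, err
	}
	return store.Save(strings.TrimSpace(title))
}

// fakeStore is the mock resource used by the test.
type fakeStore struct {
	saved []string
}

func (f *fakeStore) Save(title string) (int, error) {
	f.saved = append(f.saved, title)
	return len(f.saved), nil
}

func main() {
	// The "test" looks like the feature, only with a different resource.
	store := &fakeStore{}
	id, err := AddPost(store, "  First post ")
	fmt.Println(id, err, store.saved)
}
```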

Fast return

if (!valid){
return
}
do some other stuff
return
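
The same fast-return shape as a Go sketch. Order and Ship are hypothetical names used only for illustration: every invalid case leaves the function early, so the happy path stays unindented at the bottom.

```go
package main

import (
	"errors"
	"fmt"
)

// Order is a made-up type for the example.
type Order struct {
	Items int
	Paid  bool
}

// Ship returns as soon as a precondition fails.
func Ship(o *Order) error {
	if o == nil {
		return errors.New("no order")
	}
	if o.Items == 0 {
		return errors.New("empty order")
	}
	if !o.Paid {
		return errors.New("order not paid")
	}
	// Happy path: all guards passed.
	fmt.Println("shipping", o.Items, "items")
	return nil
}

func main() {
	fmt.Println(Ship(&Order{Items: 2, Paid: false}))
	fmt.Println(Ship(&Order{Items: 2, Paid: true}))
}
```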

Profile and Benchmark:

"It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail". (Donald Knuth)

Vocabulary

Core: The code that only you write.

Libraries: Code created or imported that can be used by other applications.
Usually it's better to find a library on the web than to develop one.
If you need to develop a library, it can be open sourced.

Service: A library that runs in the background.
As with libraries, it's better to find services on the web than to develop one.

Entry point:
Responsible for the order of initialization: it initializes services, creates library instances, creates core entities and initializes the application.

Application: Initialize features

Feature coordinator: Also known as controller, flow function or orchestrator.
Manages the flow of the different functions that make up the feature.
A good feature component looks like a requirements document.
A feature test is actually an acceptance test.

Feature example:
Flow 1 - if not logged in then log in, browse items, select item, check inventory, if it exists add to cart, pay, ship.
Flow 2 - browse items, select item, add to cart, ask to pay, if not logged in then log in, pay, check inventory, if the item exists ship it, else order it.

The step functions did not change their core functionality, only the order in which they are executed.
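
A simplified sketch of the two flows, assuming hypothetical step functions (Login, Browse, AddToCart, Pay, Ship) and leaving out the inventory check for brevity. The steps are identical in both flows; only the order passed to the coordinator differs.

```go
package main

import "fmt"

// Session carries the state the steps work on.
type Session struct {
	LoggedIn bool
	Cart     []string
	Log      []string
}

// Step is one unit of core functionality.
type Step func(s *Session)

func Login(s *Session) {
	if !s.LoggedIn {
		s.LoggedIn = true
		s.Log = append(s.Log, "login")
	}
}

func Browse(s *Session) { s.Log = append(s.Log, "browse") }

func AddToCart(s *Session) {
	s.Cart = append(s.Cart, "item")
	s.Log = append(s.Log, "add to cart")
}

func Pay(s *Session)  { s.Log = append(s.Log, "pay") }
func Ship(s *Session) { s.Log = append(s.Log, "ship") }

// run is the feature coordinator: it only decides the order.
func run(s *Session, steps ...Step) []string {
	for _, step := range steps {
		step(s)
	}
	return s.Log
}

// LoginFirstFlow asks for login before anything else.
func LoginFirstFlow(s *Session) []string {
	return run(s, Login, Browse, AddToCart, Pay, Ship)
}

// LoginLateFlow asks for login only when it is time to pay.
func LoginLateFlow(s *Session) []string {
	return run(s, Browse, AddToCart, Login, Pay, Ship)
}

func main() {
	fmt.Println(LoginFirstFlow(&Session{}))
	fmt.Println(LoginLateFlow(&Session{}))
}
```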

Boundaries

Entry point (1/3)
initialize resources
IO Device - everything that comes from persistence.
Utilities.
Analytics.
Application.

Application (2/3)
initialize features
fl = FeatureList(IODevice,Analytics)
fap = FeatureAddPost(IODevice,Analytics,Utilities)
fup = FeatureUpdatePost(IODevice,Analytics,Utilities)
fgp = FeatureGetPost(IODevice,Analytics)

Application (3/3)
initiate route by binding incoming messages to a route feature function
Initialize display
Validate incoming message.
Gather transport data like login user details
Invoke features
return the feature result to a display device

Feature
Manage the feature flow only with the data and resources it got from the entry point.

Core component
Manage task logic.
Validate post, Save Post to persistence, Search for post by several arguments, etc...

Each library or service the feature uses is a core component (a domain).
Some of the libraries are open source projects, like analytics and augment.
And some are written specifically for the solution, storage and Posts for example.
Both kinds are injected into the feature and look the same to it.
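
The three boundaries can be sketched in Go roughly like this. It is only an illustration under assumptions: the interfaces, the in-memory implementations and NewApplication are hypothetical, and the Utilities resource from the outline is omitted to keep the sketch short.

```go
package main

import "fmt"

// Resources, created by the entry point.
type IODevice interface {
	Put(id, body string)
	Get(id string) (string, bool)
}

type Analytics interface {
	Track(event string)
}

// memoryIO and countingAnalytics are stand-in implementations.
type memoryIO map[string]string

func (m memoryIO) Put(id, body string) { m[id] = body }
func (m memoryIO) Get(id string) (string, bool) {
	body, ok := m[id]
	return body, ok
}

type countingAnalytics struct{ events []string }

func (c *countingAnalytics) Track(event string) { c.events = append(c.events, event) }

// Feature is what an incoming message is bound to.
type Feature func(id, body string) string

func FeatureAddPost(dev IODevice, an Analytics) Feature {
	return func(id, body string) string {
		dev.Put(id, body)
		an.Track("add")
		return "created " + id
	}
}

func FeatureGetPost(dev IODevice, an Analytics) Feature {
	return func(id, _ string) string {
		an.Track("get")
		if body, ok := dev.Get(id); ok {
			return body
		}
		return "not found"
	}
}

// NewApplication initializes the features and binds them to routes.
func NewApplication(dev IODevice, an Analytics) map[string]Feature {
	return map[string]Feature{
		"add": FeatureAddPost(dev, an),
		"get": FeatureGetPost(dev, an),
	}
}

func main() {
	// Entry point: initialize resources, then the application.
	dev := memoryIO{}
	an := &countingAnalytics{}
	app := NewApplication(dev, an)
	fmt.Println(app["add"]("1", "hello"))
	fmt.Println(app["get"]("1", ""))
}
```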

* Code

* Anti patterns
Using a design as dogma. Think about what you are doing and make an educated decision.

Sticking to a design when you find a better way to do things.

A new requirement will come that forces you to break the API and violate the open/closed principle.
It's okay, but make an informed decision about it.

Don't over-engineer. It's okay to write dirty code, especially if it is within a boundary.

Don't design a proof of concept.

Don't use the proof of concept code in production.

Don't use sample code from the web in production.

DRY is overrated.

Limit your research. At some point it's better to write poor code than to write none at all.

Thinking that a problem is simple.

Conclusion

Creating clear boundaries lets you focus on a single problem and solve it without messing up other components.

Testing is small and focused.

Easy to refactor.

Easy to break out the parts that need to become remote services.

Tuesday, March 15, 2016

Java vs Golang - Parameterized tests

Although I like Java, and I think that despite all its drawbacks it brings years of knowledge and optimization, I find too often that when you "think in Java" you tend to create complicated solutions. By complicated I mean hard for a developer to grasp and remember.

In the following example we can see several differences that make the Java code look `old`, clumsy and not fun.
1. 7 imports vs. 1 in Golang.
2. Everything is a class. In Golang, Fibonacci is just a function.
3. Why hide from the developer the fact that the test runs in a loop?
4. All these annotations.

Simplicity matters!
When you try to produce simple code you end up with fewer dependencies, less code, less magic, fewer moving parts and fewer things to know in order to produce quality code.
And as always, but especially in programming, Less Is More.

Simplicity Matters by Rich Hickey

The Java way from JUnit-team

 import static org.junit.Assert.assertEquals;  
 import java.util.Arrays;  
 import java.util.Collection;  
 import org.junit.Test;  
 import org.junit.runner.RunWith;  
 import org.junit.runners.Parameterized;  
 import org.junit.runners.Parameterized.Parameters;  
 @RunWith(Parameterized.class)  
 public class FibonacciTest {  
   @Parameters  
   public static Collection<Object[]> data() {  
     return Arrays.asList(new Object[][] {     
          { 0, 0 }, { 1, 1 }, { 2, 1 }, { 3, 2 }, { 4, 3 }, { 5, 5 }, { 6, 8 }   
       });  
   }  
   private int fInput;  
   private int fExpected;  
   public FibonacciTest(int input, int expected) {  
     fInput= input;  
     fExpected= expected;  
   }  
   @Test  
   public void test() {  
     assertEquals(fExpected, Fibonacci.compute(fInput));  
   }  
 }  
 public class Fibonacci {  
   public static int compute(int n) {  
     int result = 0;  
     if (n <= 1) {   
       result = n;   
     } else {   
       result = compute(n - 1) + compute(n - 2);   
     }  
     return result;  
   }  
 }

The Golang way

 package Fibonacci  
 import "testing"  
 func TestFibonacci(t *testing.T) {  
      parameters := []struct {  
           input, expected int  
      }{  
           {0, 0}, {1, 1}, {2, 1}, {3, 2}, {4, 3}, {5, 5}, {6, 8},  
      }  
      // Run every input/expected pair and report each mismatch.
      for _, p := range parameters {
           actual := Fibonacci(p.input)
           if actual != p.expected {
                t.Errorf("input: %d, expected: %d, actual: %d", p.input, p.expected, actual)
           }
      }
 }  
 func Fibonacci(n int) (result int) {
 if n <= 1 {
  result = n
 } else {
  result = Fibonacci(n-1) + Fibonacci(n-2)
 }
 return
}

Wednesday, March 2, 2016

MySql huge tables chunked delete

In order to avoid table locking while deleting data from huge tables, it's recommended to delete chunks of rows.

The following code also adds two additional guards.

Date guard: It's impossible to delete data newer than 3 months.

Time guard: If the process takes more than 10 minutes, it quits.

1. First, we create a dynamic query utility procedure.

 CREATE PROCEDURE `dynamicQuery`(query_string text)
 BEGIN
     set @st=query_string;
     PREPARE stmt FROM @st;
     EXECUTE stmt;
     DEALLOCATE PREPARE stmt;
 END

2. For debugging we'll add a log table.

 CREATE TABLE `log` (
  `min` int(11) DEFAULT NULL,
  `max` int(11) DEFAULT NULL,
  `target` date DEFAULT NULL,
  `scan` date DEFAULT NULL,
  `scanid` int(11) DEFAULT NULL,
  `counter` int(11) DEFAULT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `procid` int(11) NOT NULL DEFAULT '0',
  `object` varchar(63) DEFAULT NULL
 ) ENGINE=MyISAM;

3. Create delete_until procedure.

 CREATE PROCEDURE `delete_until`(tbl_name varchar(63),until_date DATE)  
 main: BEGIN  
   SET @TNAME=tbl_name;  
   SET @TARGET=until_date;  
   SET @MX=0;  
   SET @MN=0;  
   SET @BATCH=7000;  
   SET @VALIDTARGET=1;  
   SET @DELETED=-100;  
   SET @PROC=connection_id();  
   SET @ETSTART=UNIX_TIMESTAMP();  
   SET @TIMEOUT=600; -- 10 minutes  
   -- 3 month guard. It's impossible to delete data from the last three months  
      if datediff(date(now()),@TARGET) < 91 then  
           select 'only delete data that is older than three months';  
           LEAVE main;  
   end if;  
      SET @DELETED=0;  
      -- Get min and max id  
      SET @q=CONCAT("SELECT max(id), min(id) into @MX,@MN from ",@TNAME);  
      call dynamicQuery(@q);  
      -- Check that the row with the min id is older than @TARGET  
      SET @cond=CONCAT("SELECT EXISTS ( SELECT id FROM ",@TNAME," WHERE id = @MN AND created_at < @TARGET ) into @VALIDTARGET;");  
      call dynamicQuery(@cond);  
      -- Loop from min to max in 7000 batch  
      deleteloop: WHILE @VALIDTARGET>0 DO  
           -- Delete rows   
           SET @q=CONCAT("delete from ",@TNAME," where id between @MN and @MN+@BATCH and created_at < @TARGET;");  
           call dynamicQuery(@q);  
           -- Accumulate rows for summary  
           SET @DELETED = @DELETED+@BATCH;  
           -- Promote @MN Iterator  
           SET @MN=@MN+@BATCH+1;  
           -- If the first id of the next batch is missing or not older than @TARGET, the exit condition is met. @VALIDTARGET=0.  
           call dynamicQuery(@cond);  
           -- insert to log  
           insert into log(object,procid,min,max,target,scan,scanid,counter) values(@TNAME,@PROC,@MN,@Mx,@TARGET,null,@VALIDTARGET,@DELETED);  
           SET @ETEND=UNIX_TIMESTAMP();  
           if @ETEND-@ETSTART > @TIMEOUT then -- Ten minutes timeout guard exceeded.  
                LEAVE deleteloop;  
           end if;  
      END WHILE;  
      SELECT @DELETED;  
 END

4. A chunk insert as a bonus.

 CREATE PROCEDURE `chunk_insert`(tbl_source varchar(63),tbl_target varchar(63))
 BEGIN
   SET @TOTABLE=tbl_target;
   SET @FROMTABLE=tbl_source;
   SET @MX=0;
   SET @MN=0;
   SET @BATCH=7000;
   SET @TOID=0;
   SET @LASTLOOP=0;
   SET @INSERTED=0;
   SET @PROC=connection_id();
   -- Get min and max id from remote table
   SET @q=CONCAT("SELECT max(id), min(id) into @MX,@MN from ",@FROMTABLE);
   call dynamicQuery(@q);
   -- Loop from min to max in 7000 batch
   WHILE @LASTLOOP=0 DO
     IF @MN+@BATCH > @MX THEN
       SET @LASTLOOP=1;
       SET @TOID=@MX;
       SET @INSERTED=@INSERTED+(@MX-@MN);
     ELSE
       SET @TOID=@MN+@BATCH;
       SET @INSERTED=@INSERTED+@BATCH;
     END IF;
     insert into log (procid,min,max,target,scan,scanid,counter) values(@PROC,@MN,@Mx,null,null,@TOID,@LASTLOOP);
     SET @q=CONCAT('insert into ',@TOTABLE,' select * from ',@FROMTABLE,' where id between @MN and @TOID;');
     call dynamicQuery(@q);
     SET @MN=@TOID+1;
   END WHILE;
   select @INSERTED;
 END

5. Usage: Delete data that is more than 5 months old from table `my_huge_table_name`

call delete_until('my_huge_table_name',DATE(NOW() - INTERVAL 5 MONTH));
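
To see which id ranges a chunked loop visits, the arithmetic of the procedures above (id between @MN and @MN+@BATCH, then @MN=@MN+@BATCH+1) can be sketched outside MySQL. This is only an illustration in Go with hypothetical names (Range, Chunks), not a replacement for the procedures.

```go
package main

import "fmt"

// Range is an inclusive id range, like "id between From and To".
type Range struct {
	From, To int
}

// Chunks splits [lo, hi] into consecutive inclusive ranges.
// Each range covers at most batch+1 ids, mirroring
// "between @MN and @MN+@BATCH" followed by "@MN=@MN+@BATCH+1".
func Chunks(lo, hi, batch int) []Range {
	if batch < 0 {
		return nil // a negative batch would never advance
	}
	var out []Range
	for from := lo; from <= hi; from += batch + 1 {
		to := from + batch
		if to > hi {
			to = hi // last, shorter chunk
		}
		out = append(out, Range{From: from, To: to})
	}
	return out
}

func main() {
	for _, r := range Chunks(1, 20, 7) {
		fmt.Printf("id between %d and %d\n", r.From, r.To)
	}
}
```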

Wednesday, December 30, 2015

MySql dump - exclude tables by pattern

In this example we want to ignore tables whose names are prefixed with lhm_ or _ (underscore).

Create a bash script with the following lines:

#Generate --ignore-table clause
IGNORE=`mysql -A -N -umyuser -pmypass -hmyserver -e"select group_concat(DISTINCT d.ig ORDER BY d.ig ASC SEPARATOR ' ') from (select concat('--ignore-table=mydatabase.',table_name) as ig from information_schema.tables where (table_name like 'lhm\_%' or table_name like '\_%') and table_schema='mydatabase') d;"`

#Run mysqldump with the generated $IGNORE
mysqldump -umyuser -pmypass -hmyserver --routines --no-data --compact --triggers --hex-blob $IGNORE mydatabase -r 'dump.sql'

Thursday, July 18, 2013

Hebrew encoding

Fix encoding

node-iconv -t UTF-8 -f ISO-8859-8 in.srt > out.srt

Here are some guidelines for choosing the source (-f) encoding.

In these three examples the Hebrew appears reversed. Note that in the first example the exclamation mark is on the right side, and in the last two examples it is on the left side.

This is CP-1255 displayed as ISO-8859-8

This is ISO-8859-8 displayed as CP-1255

This is ISO-8859-8 displayed as MacHebrew

Another encoding problem that causes a similar effect is specifying ISO-8859-8 instead of ISO-8859-8-I, or vice versa. So check that possibility as well.

The next four examples contain invalid bytes, hence the diamond symbol. Note that only in the last example does the exclamation mark appear on the left side.

This is UTF-8 displayed as CP-1255

This is UTF-8 displayed as ISO-8859-8

This is CP-1255 displayed as UTF-8

This is ISO-8859-8 displayed as UTF-8

This case is easy to identify because of the crosses separating the characters.

This is UTF-8 displayed as ISO-8859-1

The "Swedish" script is the result of displaying other encodings as a Latin encoding:

This is CP-1255 displayed as ISO-8859-1

This is ISO-8859-8 displayed as ISO-8859-1

This case is fairly rare. Here we see various encodings displayed as the IBM encoding.

This is CP-1255 displayed as IBM-862

This is ISO-8859-8 displayed as IBM-862

These two cases are also rare. Both are caused by displaying UTF-8 as different encodings.

This is UTF-8 displayed as IBM-862

This is UTF-8 displayed as MacHebrew

http://gibberish.co.il/encoding.html
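
The case of UTF-8 displayed as ISO-8859-1 can be reproduced with a few lines of Go. This is only a sketch, and AsLatin1 is a hypothetical helper name. Every Hebrew letter is encoded in UTF-8 as two bytes, the first of which is 0xD7, and 0xD7 is the multiplication sign in ISO-8859-1. That is where the crosses between the characters come from.

```go
package main

import "fmt"

// AsLatin1 reinterprets each byte of s as one ISO-8859-1 character,
// which is what a viewer does when it opens UTF-8 text as ISO-8859-1.
func AsLatin1(s string) string {
	runes := make([]rune, 0, len(s))
	for i := 0; i < len(s); i++ {
		// In ISO-8859-1 the byte value equals the Unicode code point.
		runes = append(runes, rune(s[i]))
	}
	return string(runes)
}

func main() {
	shalom := "\u05e9\u05dc\u05d5\u05dd" // the Hebrew word "shalom", four letters
	garbled := AsLatin1(shalom)
	// %+q prints non-ASCII characters as escapes, so the pattern is visible.
	fmt.Printf("%+q\n", garbled)
}
```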

Friday, July 15, 2011

Make current directory DocumentRoot in apache

Create the following shell script and put it in the directory you want to use as the Apache root directory.

#!/bin/sh
sed 's!DocumentRoot .*!DocumentRoot '"$(pwd)"'!' /etc/apache2/sites-available/default > default
sudo mv default /etc/apache2/sites-available/
sudo /etc/init.d/apache2 restart

Friday, April 8, 2011

Connect Oracle with CakePHP

Tested with Windows XP and XAMPP
1. Install XAMPP
2. Install OracleXEUniv
3. In php.ini, uncomment the following line:
     extension=php_oci8.dll
4. Set app/config/database.php :
var $default = array('driver' => 'oracle',
             'connect' => 'oci',
            'persistent' => false,
            'host' => 'localhost',
            'port'=>1521,
            'login' => 'root',
            'password' => 'pass',
            'schema'=>'schema_name',
            'database' => 'localhost:1521/xe',
            'prefix' => ''
         );